Urdu Bench

Cost vs Performance

Overall score (higher is better) vs total run cost (lower is better). The green quadrant (high score, low cost) is the target.

Pure Cost

Total run cost per model (lower is better).

Total Tokens vs Performance

Generated tokens (lower is better) vs score (higher is better).

Total Tokens

Total generated tokens per model (lower is better).

Model Time vs Performance

Selected score (higher is better) vs total model time (lower is better).

Total Model Time

Total time taken by each model across all requests (lower is better).