Charts | Urdu Bench (Urdu LLM Benchmark)

Cost vs Performance

Overall score (higher is better) vs total run cost (lower is better). The green quadrant marks the target region: high score at low cost.
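As a rough illustration of how a target quadrant like this could be computed, the sketch below flags models whose score is at or above the median and whose cost is at or below the median. The median cut-offs, the `target_quadrant` helper, and the model names and numbers are all assumptions made for this example; the page does not say how the chart draws its quadrant boundaries, and the values are not Urdu Bench results.

```python
from statistics import median


def target_quadrant(models):
    """Return names of models with score >= median score and cost <= median cost.

    Using the median as the quadrant boundary is an assumption for this sketch.
    """
    score_cut = median(m["score"] for m in models)
    cost_cut = median(m["cost"] for m in models)
    return [
        m["name"]
        for m in models
        if m["score"] >= score_cut and m["cost"] <= cost_cut
    ]


# Hypothetical values for illustration only, not Urdu Bench results.
models = [
    {"name": "model-a", "score": 0.82, "cost": 0.30},
    {"name": "model-b", "score": 0.74, "cost": 1.50},
    {"name": "model-c", "score": 0.61, "cost": 0.10},
    {"name": "model-d", "score": 0.88, "cost": 4.00},
]

print(target_quadrant(models))
```

With these made-up numbers, only the model that is both above the median score and below the median cost is returned; a different boundary rule (fixed thresholds, for instance) would give a different set.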

Total run cost per model (lower is better).

Generated tokens (lower is better) vs score (higher is better).

Total generated tokens per model.

Selected score (higher is better) vs total model time (lower is better).

Total time taken by each model across all requests.