Cost vs Performance
Overall score (higher is better) vs total run cost (lower is better). The green quadrant (high score, low cost) is the target.
Token Use
Total generated tokens per model. Where the breakdown is available, we split tokens into those in the final answer and those spent on extra deliberation.
Model Time
Total time each model took to complete all requests.