code
Mar 15, 2026 · EVAL-20260315-031910

Write a Python function that returns the second largest value from a list of integers. Handle edge cases.
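For reference, a minimal sketch of one way the task above could be answered. It is not any respondent's submission. It assumes "second largest" means the second largest distinct value, and that returning `None` is an acceptable way to signal that no such value exists (an empty list, a single element, or all elements equal). The name `second_largest` is hypothetical.

```python
from typing import Optional, Sequence


def second_largest(values: Sequence[int]) -> Optional[int]:
    """Return the second largest distinct value, or None if there is none.

    Assumption: duplicates of the maximum do not count as "second largest",
    so [4, 4, 3] yields 3 and [7, 7] yields None.
    """
    largest: Optional[int] = None
    second: Optional[int] = None
    for v in values:
        if largest is None or v > largest:
            # New maximum: the previous maximum becomes the runner-up.
            largest, second = v, largest
        elif v != largest and (second is None or v > second):
            # Strictly below the maximum but above the current runner-up.
            second = v
    return second
```

A single pass keeps the work at O(n) without sorting or copying the list. A caller that prefers an exception over `None` could check the result and raise `ValueError` instead.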

Winner: Qwen 3 32B (openrouter)
Winner score: 9.66
Matrix avg: 9.26
10×10 Judgment Matrix · 79 judgments
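| Judge ↓ / Respondent → | Qwen 3 32B | Devstral Small | Gemma 3 27B | Llama 4 Scout | Phi-4 14B | Granite 4.0 Micro | Qwen 3 8B | Mistral Nemo 12B | Llama 3.1 8B | Kimi K2.5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen 3 32B | | · | 9.3 | 8.6 | 9.0 | 9.8 | 9.0 | · | 9.3 | 9.6 |
| Devstral Small | 9.4 | | 9.4 | 9.6 | 8.8 | 9.4 | 9.6 | 9.6 | 9.4 | 9.8 |
| Gemma 3 27B | 9.8 | 9.6 | | 9.4 | 9.4 | 9.2 | 9.1 | 9.2 | 9.6 | 9.8 |
| Llama 4 Scout | 10.0 | 9.3 | 9.6 | | 8.4 | 9.3 | 9.6 | 9.3 | 9.3 | 9.8 |
| Phi-4 14B | 9.8 | 8.4 | 9.6 | 9.6 | | 9.4 | 9.6 | 9.6 | 9.0 | 9.8 |
| Granite 4.0 Micro | 9.3 | 8.6 | 8.8 | 8.6 | 8.6 | | 8.6 | 8.6 | 8.4 | 8.8 |
| Qwen 3 8B | 9.8 | 9.8 | 9.6 | 9.4 | 9.6 | 9.6 | | 9.2 | 9.3 | 9.8 |
| Mistral Nemo 12B | 10.0 | 8.7 | 9.3 | 9.3 | 9.4 | 8.9 | 9.3 | | 8.7 | 9.1 |
| Llama 3.1 8B | 9.3 | 9.4 | 9.3 | 9.1 | 8.4 | 9.4 | 9.2 | 8.2 | | 9.3 |
| Kimi K2.5 | · | · | · | · | · | · | · | · | · | |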