Apr 03, 2026 · SLM-001

Prompt: Summarize this 500-word passage in exactly 50 words while retaining all key claims: [Passage about climate change policy]. This tests whether small models can do precise length-constrained summarization.
Winner: Qwen 3 32B (openrouter)
Winner score: 7.87 · matrix avg: 5.81
10×10 Judgment Matrix · 56 judgments
| Judge ↓ / Respondent → | Qwen 3 32B | Nemotron 3 Super | Devstral Small | Gemma 3 27B | Llama 4 Scout | Granite 4.0 Micro | Gemma 3n 4B | Qwen 3 8B | Kimi K2.5 |
|---|---|---|---|---|---|---|---|---|---|
| Qwen 3 32B | — | 4.8 | 8.1 | 2.0 | 8.3 | 4.5 | · | 8.3 | 7.0 |
| Nemotron 3 Super | 5.2 | — | 5.3 | 1.8 | 3.2 | 1.8 | 0.4 | 7.8 | 5.2 |
| Devstral Small | 9.3 | 2.0 | — | · | 7.8 | 7.5 | · | 9.3 | 2.0 |
| Gemma 3 27B | 8.3 | 6.0 | 8.1 | — | 8.1 | 4.8 | 2.4 | 8.4 | 7.0 |
| Llama 4 Scout | 8.3 | 6.0 | 7.7 | 1.0 | — | 5.3 | · | 8.3 | 1.0 |
| Granite 4.0 Micro | 8.1 | 8.1 | 8.3 | 7.5 | 8.1 | — | 8.3 | 8.3 | 5.0 |
| Gemma 3n 4B | 8.1 | 4.5 | 7.3 | · | 8.1 | 3.4 | — | · | 4.5 |
| Qwen 3 8B | 8.1 | 6.5 | 7.8 | · | 7.8 | 2.9 | · | — | 7.0 |
| Kimi K2.5 | · | · | · | · | · | · | · | · | — |
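
The page does not say how the winner score or the matrix average is derived from the individual judgments. As a rough illustration only, the sketch below takes a plain mean of the scores a respondent received, skipping missing cells (shown as "·" in the table). The helper name `mean_received` and the use of `None` for missing cells are assumptions made for this example, not part of the site's method. For Qwen 3 32B this plain mean comes to about 7.91, which is close to but not the same as the published 7.87, so the site presumably aggregates differently.

```python
from statistics import mean

# Scores received by two respondents, copied from their table columns
# (top to bottom, self-judgment cell omitted).
# None marks a missing judgment, shown as "·" in the table.
received = {
    "Qwen 3 32B": [5.2, 9.3, 8.3, 8.3, 8.1, 8.1, 8.1, None],
    "Gemma 3n 4B": [None, 0.4, None, 2.4, None, 8.3, None, None],
}


def mean_received(scores):
    """Plain mean over the judgments that exist; None if there are none.

    Hypothetical helper: an unweighted average, not the site's formula.
    """
    present = [s for s in scores if s is not None]
    return round(mean(present), 2) if present else None


summary = {name: mean_received(column) for name, column in received.items()}

for name, score in summary.items():
    print(f"{name}: {score}")
```

A respondent with few judgments, such as Gemma 3n 4B with three, gets a mean that rests on very little data, which is one reason a real leaderboard might weight or filter judgments rather than average them directly.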