The Multivac — Ask any model, routed by evaluation

Evaluation EVAL-20260403-105726
communication
Jan 15, 2026 · COMM-001
Explain how transformer neural networks work. Provide two explanations:

1. For a junior software developer who knows basic Python but has no ML background
2. For a senior ML engineer who knows CNNs/RNNs but hasn't worked with transformers

Both explanations should be technically accurate. The first should build intuition; the second should highlight architectural innovations.
Winner: Grok 4.20 (openrouter)
Winner score: 9.12 · matrix avg: 8.48
10×10 Judgment Matrix · 89 judgments
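| Judge ↓ / Respondent → | GPT-5.4 | Claude Opus 4.6 | Claude Sonnet 4.6 | Gemini 3.1 Pro | Grok 4.20 | DeepSeek V4 | GPT-OSS-120B | MiMo-V2-Flash | Mistral Small | Seed 1.6 Flash |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-5.4 | — | 4.3 | 6.8 | 3.6 | 8.8 | 7.8 | 4.6 | 8.6 | 8.2 | 8.2 |
| Claude Opus 4.6 | 8.4 | — | 9.0 | 6.5 | 9.2 | 8.0 | 7.3 | 8.8 | 8.8 | 8.3 |
| Claude Sonnet 4.6 | 9.3 | 8.6 | — | 8.1 | 9.3 | 8.6 | 9.0 | 9.0 | 9.3 | 8.8 |
| Gemini 3.1 Pro | 6.2 | 6.4 | 8.7 | — | 10.0 | 10.0 | 6.3 | 9.6 | 9.0 | 10.0 |
| Grok 4.20 | 8.6 | 8.6 | 8.8 | 7.6 | — | 8.6 | 8.6 | 8.8 | 8.8 | 8.6 |
| DeepSeek V4 | 8.6 | 8.8 | 8.8 | 8.4 | 8.8 | — | 9.6 | 9.7 | 10.0 | 9.4 |
| GPT-OSS-120B | 8.1 | 6.0 | 8.4 | 6.0 | 8.4 | 8.4 | — | 8.3 | 8.7 | 8.4 |
| MiMo-V2-Flash | 8.6 | 8.6 | 8.7 | · | 9.2 | 8.8 | 9.0 | — | 9.0 | 9.0 |
| Mistral Small | 9.8 | 9.6 | 9.8 | 9.6 | 9.7 | 9.4 | 10.0 | 9.7 | — | 10.0 |
| Seed 1.6 Flash | 8.7 | 8.2 | 8.7 | 7.6 | 8.7 | 8.4 | 8.7 | 9.0 | 8.8 | — |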
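
The page does not say how the winner score is derived from the matrix. A minimal sketch, assuming each respondent's score is the plain mean of the scores it receives from the other judges (its column, skipping the self-judgment and any missing cell), is shown below. The names `MODELS`, `MATRIX`, and `respondent_means` are illustrative and do not come from the site or its API.

```python
from statistics import mean

MODELS = [
    "GPT-5.4", "Claude Opus 4.6", "Claude Sonnet 4.6", "Gemini 3.1 Pro",
    "Grok 4.20", "DeepSeek V4", "GPT-OSS-120B", "MiMo-V2-Flash",
    "Mistral Small", "Seed 1.6 Flash",
]

# Rows are judges, columns are respondents, in MODELS order.
# None marks a self-judgment (diagonal) or a missing judgment.
_ = None
MATRIX = [
    [_,   4.3, 6.8, 3.6, 8.8,  7.8,  4.6,  8.6, 8.2,  8.2],
    [8.4, _,   9.0, 6.5, 9.2,  8.0,  7.3,  8.8, 8.8,  8.3],
    [9.3, 8.6, _,   8.1, 9.3,  8.6,  9.0,  9.0, 9.3,  8.8],
    [6.2, 6.4, 8.7, _,   10.0, 10.0, 6.3,  9.6, 9.0,  10.0],
    [8.6, 8.6, 8.8, 7.6, _,    8.6,  8.6,  8.8, 8.8,  8.6],
    [8.6, 8.8, 8.8, 8.4, 8.8,  _,    9.6,  9.7, 10.0, 9.4],
    [8.1, 6.0, 8.4, 6.0, 8.4,  8.4,  _,    8.3, 8.7,  8.4],
    [8.6, 8.6, 8.7, _,   9.2,  8.8,  9.0,  _,   9.0,  9.0],
    [9.8, 9.6, 9.8, 9.6, 9.7,  9.4,  10.0, 9.7, _,    10.0],
    [8.7, 8.2, 8.7, 7.6, 8.7,  8.4,  8.7,  9.0, 8.8,  _],
]


def respondent_means(models, matrix):
    """Average the scores each respondent received (assumed scoring rule)."""
    means = {}
    for col, name in enumerate(models):
        received = [row[col] for row in matrix if row[col] is not None]
        means[name] = mean(received)
    return means


means = respondent_means(MODELS, MATRIX)
winner = max(means, key=means.get)
judgments = sum(score is not None for row in MATRIX for score in row)

print(winner, round(means[winner], 2), judgments)
# prints: Grok 4.20 9.12 89
```

Under this assumption the sketch reproduces the listed winner, the 9.12 winner score, and the count of 89 judgments. The matrix values are rounded to one decimal as displayed, so other aggregates computed from them may differ slightly from the figures on the page.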