The Multivac — Ask any model, routed by evaluation

Evaluation EVAL-20260402-172828
reasoning
Apr 02, 2026 · REASON-020
Explain Godel's First Incompleteness Theorem to someone who understands basic logic but not formal mathematics. Then: (1) What does it actually imply about AI? (Hint: less than most people think.) (2) Some people claim Godel's theorem means AI can never match human intelligence. Evaluate this claim rigorously. (3) Does Godel's theorem apply to neural networks? Why or why not?
Winner: Grok 4.20 (openrouter)
Winner score: 9.09
Matrix avg: 8.32
10×10 Judgment Matrix · 87 judgments
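| Judge ↓ / Respondent → | Grok 4.20 | Gemini 3.1 Pro | DeepSeek V4 | Claude Opus 4.6 | GPT-5.4 | Claude Sonnet 4.6 | MiMo-V2-Flash | GPT-OSS-120B | Gemini 2.5 Flash | MiniMax M2.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Grok 4.20 | — | 7.5 | 8.6 | 8.8 | 8.8 | 8.8 | 8.4 | 8.7 | 8.4 | 8.7 |
| Gemini 3.1 Pro | 10.0 | — | 7.0 | 8.4 | 7.9 | 7.7 | 9.4 | 7.3 | 6.8 | 8.6 |
| DeepSeek V4 | 8.7 | 8.4 | — | 8.8 | 8.4 | 9.0 | 8.7 | 8.8 | 8.7 | 9.0 |
| Claude Opus 4.6 | 9.2 | 6.0 | 7.2 | — | 9.0 | 9.0 | 7.8 | 8.0 | 7.8 | 8.8 |
| GPT-5.4 | 9.0 | 2.9 | 8.6 | 7.7 | — | 7.8 | 8.0 | 6.7 | 7.1 | 8.2 |
| Claude Sonnet 4.6 | · | 5.8 | 8.2 | 9.7 | 8.8 | — | 8.6 | 8.7 | 8.8 | 8.8 |
| MiMo-V2-Flash | 9.0 | 8.1 | 9.0 | 9.2 | 9.0 | 8.8 | — | 8.8 | 8.8 | 8.8 |
| GPT-OSS-120B | 8.4 | 6.6 | 8.3 | 8.4 | · | · | 8.0 | — | 8.7 | 8.3 |
| Gemini 2.5 Flash | 9.4 | 7.8 | 8.8 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | — | 9.0 |
| MiniMax M2.5 | 9.0 | 6.0 | 7.8 | 8.7 | 8.3 | 7.9 | 9.0 | 7.7 | 7.7 | — |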
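The page does not state how the winner score and matrix average are derived from the judgment matrix. The sketch below tries one plausible reading, which is an assumption: each respondent's score is the mean of the scores it received from other judges (skipping the "—" self-judgment cells and the "·" missing cells), and the matrix average is the mean of those per-respondent scores. With the rounded values shown in the matrix, this comes out at roughly 9.09 for Grok 4.20 and roughly 8.32 overall, and it counts 87 judgments. The names `MODELS`, `MATRIX`, and `respondent_scores` are illustrative and not part of any Multivac API.

```python
from statistics import mean

MODELS = [
    "Grok 4.20", "Gemini 3.1 Pro", "DeepSeek V4", "Claude Opus 4.6", "GPT-5.4",
    "Claude Sonnet 4.6", "MiMo-V2-Flash", "GPT-OSS-120B", "Gemini 2.5 Flash",
    "MiniMax M2.5",
]

# Rows are judges, columns are respondents (same order as MODELS).
# N marks a cell with no score: a self-judgment ("—") or a missing judgment ("·").
N = None
MATRIX = [
    [N,    7.5, 8.6, 8.8, 8.8, 8.8, 8.4, 8.7, 8.4, 8.7],
    [10.0, N,   7.0, 8.4, 7.9, 7.7, 9.4, 7.3, 6.8, 8.6],
    [8.7,  8.4, N,   8.8, 8.4, 9.0, 8.7, 8.8, 8.7, 9.0],
    [9.2,  6.0, 7.2, N,   9.0, 9.0, 7.8, 8.0, 7.8, 8.8],
    [9.0,  2.9, 8.6, 7.7, N,   7.8, 8.0, 6.7, 7.1, 8.2],
    [N,    5.8, 8.2, 9.7, 8.8, N,   8.6, 8.7, 8.8, 8.8],
    [9.0,  8.1, 9.0, 9.2, 9.0, 8.8, N,   8.8, 8.8, 8.8],
    [8.4,  6.6, 8.3, 8.4, N,   N,   8.0, N,   8.7, 8.3],
    [9.4,  7.8, 8.8, 9.0, 9.0, 9.0, 9.0, 9.0, N,   9.0],
    [9.0,  6.0, 7.8, 8.7, 8.3, 7.9, 9.0, 7.7, 7.7, N],
]


def respondent_scores(matrix, models):
    """Mean score received by each respondent (column), ignoring empty cells."""
    scores = {}
    for col, name in enumerate(models):
        received = [row[col] for row in matrix if row[col] is not None]
        scores[name] = mean(received)
    return scores


scores = respondent_scores(MATRIX, MODELS)
winner = max(scores, key=scores.get)
matrix_avg = mean(scores.values())  # assumed: mean of per-respondent means
judgments = sum(v is not None for row in MATRIX for v in row)

print(f"winner: {winner} ({scores[winner]:.2f})")
print(f"matrix avg: {matrix_avg:.2f}")
print(f"judgments: {judgments}")
```

Averaging per respondent rather than over all cells matters here because three judgments are missing, so the columns do not all have the same number of scores.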