The Multivac — Ask any model, routed by evaluation

Evaluation EVAL-20260402-183311 · reasoning · Apr 02, 2026 · REASON-030
You believe X. Why? Because of Y. Why believe Y? Because of Z. This goes on forever (infinite regress). Three philosophical positions attempt to solve this: (1) Foundationalism — some beliefs need no justification. (2) Coherentism — beliefs justify each other in a web. (3) Infinitism — the chain can be infinite. Evaluate each. Then: how does an AI language model 'justify' its outputs? Does it face the same problem?
Winner: GPT-5.4 (openrouter)
Winner score: 9.22 · matrix avg: 8.54
10×10 Judgment Matrix · 88 judgments
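| Judge ↓ / Respondent → | GPT-OSS-120B | Gemini 3.1 Pro | DeepSeek V4 | Claude Opus 4.6 | GPT-5.4 | Grok 4.20 | Claude Sonnet 4.6 | MiMo-V2-Flash | Gemini 2.5 Flash | MiniMax M2.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-OSS-120B | — | 8.1 | 8.3 | 8.7 | 8.7 | · | 8.7 | 8.7 | 8.4 | 8.4 |
| Gemini 3.1 Pro | 6.0 | — | 9.0 | 8.0 | 10.0 | 10.0 | 8.0 | 10.0 | 7.9 | 9.7 |
| DeepSeek V4 | 9.3 | 8.4 | — | 8.8 | 8.8 | 8.8 | 8.8 | 9.3 | 9.3 | 9.3 |
| Claude Opus 4.6 | 7.2 | 7.3 | 7.8 | — | 9.4 | 9.6 | 8.8 | 9.1 | 8.4 | 8.6 |
| GPT-5.4 | 5.9 | 5.9 | 8.8 | 7.7 | — | 9.2 | 7.3 | 9.0 | 8.2 | 9.0 |
| Grok 4.20 | 8.3 | 7.9 | 8.7 | 8.7 | 9.1 | — | 8.5 | 8.7 | 8.4 | 8.4 |
| Claude Sonnet 4.6 | 8.3 | 7.8 | 7.8 | 8.7 | 9.2 | 9.2 | — | 8.7 | 8.4 | 8.7 |
| MiMo-V2-Flash | 9.1 | 8.4 | 8.4 | 8.8 | 9.2 | 9.2 | 8.7 | — | · | 9.2 |
| Gemini 2.5 Flash | 8.8 | 8.8 | 8.8 | 8.8 | 10.0 | 8.8 | 8.7 | 9.8 | — | 8.8 |
| MiniMax M2.5 | 6.4 | 6.3 | 8.3 | 7.7 | 8.7 | 8.4 | 7.5 | 9.0 | 8.0 | — |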
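
The page does not say how the winner score or the matrix average is computed. A minimal sketch, assuming the winner score is the mean of the scores a respondent receives from the other judges and the matrix average is the mean over all filled cells, is shown below. The names `MODELS`, `SCORES`, `respondent_means`, and `matrix_mean` are hypothetical, and `None` stands in for both the diagonal (`—`) and the two empty (`·`) cells, which is also an assumption about what those markers mean.

```python
from statistics import mean

# Column/row order as it appears in the judgment matrix.
MODELS = [
    "GPT-OSS-120B", "Gemini 3.1 Pro", "DeepSeek V4", "Claude Opus 4.6",
    "GPT-5.4", "Grok 4.20", "Claude Sonnet 4.6", "MiMo-V2-Flash",
    "Gemini 2.5 Flash", "MiniMax M2.5",
]

# Rows are judges, columns are respondents.
# None marks a diagonal cell or a cell with no score (assumed meaning).
SCORES = [
    [None, 8.1, 8.3, 8.7, 8.7, None, 8.7, 8.7, 8.4, 8.4],
    [6.0, None, 9.0, 8.0, 10.0, 10.0, 8.0, 10.0, 7.9, 9.7],
    [9.3, 8.4, None, 8.8, 8.8, 8.8, 8.8, 9.3, 9.3, 9.3],
    [7.2, 7.3, 7.8, None, 9.4, 9.6, 8.8, 9.1, 8.4, 8.6],
    [5.9, 5.9, 8.8, 7.7, None, 9.2, 7.3, 9.0, 8.2, 9.0],
    [8.3, 7.9, 8.7, 8.7, 9.1, None, 8.5, 8.7, 8.4, 8.4],
    [8.3, 7.8, 7.8, 8.7, 9.2, 9.2, None, 8.7, 8.4, 8.7],
    [9.1, 8.4, 8.4, 8.8, 9.2, 9.2, 8.7, None, None, 9.2],
    [8.8, 8.8, 8.8, 8.8, 10.0, 8.8, 8.7, 9.8, None, 8.8],
    [6.4, 6.3, 8.3, 7.7, 8.7, 8.4, 7.5, 9.0, 8.0, None],
]


def respondent_means(models, scores):
    """Mean score each respondent received, skipping empty cells."""
    result = {}
    for col, name in enumerate(models):
        received = [row[col] for row in scores if row[col] is not None]
        result[name] = mean(received)
    return result


def matrix_mean(scores):
    """Return (number of filled cells, mean over filled cells)."""
    cells = [v for row in scores for v in row if v is not None]
    return len(cells), mean(cells)


means = respondent_means(MODELS, SCORES)
winner = max(means, key=means.get)
count, overall = matrix_mean(SCORES)

print(f"winner: {winner} ({means[winner]:.2f})")
print(f"filled cells: {count}, matrix mean: {overall:.2f}")
```

Under these assumptions the sketch counts 88 filled cells and ranks GPT-5.4 first. Because the cells are shown to one decimal, the recomputed means land near, but not exactly on, the reported 9.22 and 8.54; the site may aggregate unrounded scores or weight judges differently.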