EVAL-20260402-181953
reasoning
Apr 02, 2026 · REASON-028

(1) Explain P vs NP to a smart non-technical person using only analogies and examples. (2) Why do most computer scientists believe P ≠ NP? (3) If someone proved P = NP tomorrow, what would the practical consequences be? (Be specific about cryptography, optimization, and AI.) (4) Would P = NP mean all problems are easy to solve? Why or why not?

Winner: GPT-5.4 (openrouter)
Winner score: 9.12
Matrix avg: 8.17
10×10 Judgment Matrix · 88 judgments
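| Judge ↓ / Respondent → | GPT-OSS-120B | Gemini 3.1 Pro | DeepSeek V4 | Claude Opus 4.6 | GPT-5.4 | Grok 4.20 | Claude Sonnet 4.6 | MiMo-V2-Flash | Gemini 2.5 Flash | MiniMax M2.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-OSS-120B |  | · | 8.3 | 7.1 | 8.8 | 8.7 | 7.3 | 8.7 | 8.4 | 7.6 |
| Gemini 3.1 Pro | 7.2 |  | 8.2 | 7.0 | 10.0 | 9.6 | 7.2 | 7.3 | 8.3 | 8.8 |
| DeepSeek V4 | 8.8 | 8.4 |  | 8.8 | 9.0 | 9.0 | 9.0 | 9.0 | 9.0 | 8.4 |
| Claude Opus 4.6 | 8.0 | 6.3 | 6.8 |  | 9.0 | 9.2 | 8.3 | 8.3 | 7.3 | 6.8 |
| GPT-5.4 | 5.2 | 4.8 | 6.5 | 6.1 |  | 7.3 | 6.0 | 7.5 | 6.2 | 6.7 |
| Grok 4.20 | 8.6 | 7.5 | 8.3 | 8.8 | 9.0 |  | 8.6 | 8.8 | 8.6 | 8.3 |
| Claude Sonnet 4.6 | 8.6 | 6.5 | 7.5 | 9.0 | 9.0 | 9.0 |  | 8.3 | 8.2 | 7.5 |
| MiMo-V2-Flash | 8.6 | 8.6 | 9.0 | 9.0 | 9.0 | 9.0 | 8.4 |  | 8.6 | 8.6 |
| Gemini 2.5 Flash | 9.4 | 8.4 | 8.8 | 9.0 | 9.4 | 9.4 | 9.2 | 9.4 |  | 8.8 |
| MiniMax M2.5 | · | 7.1 | 7.8 | 7.8 | 8.8 | 8.8 | 7.9 | 8.8 | 8.1 |  |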
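
The winner score and matrix average reported above can be roughly reproduced from the matrix. The Python sketch below rests on assumptions the page does not state: that a respondent's score is the plain mean of the judgments it received (its column), that self-judgments and cells shown as "·" are excluded, and that the matrix average is the mean of all 88 remaining cells. The names `models`, `matrix`, `received`, `winner`, and `matrix_avg` are illustrative only. Because the displayed cells are rounded to one decimal, the recomputed values land within about 0.02 of the reported 9.12 and 8.17 rather than matching exactly.

```python
from statistics import mean

# Column order (respondents) and row order (judges) as displayed in the matrix.
models = [
    "GPT-OSS-120B", "Gemini 3.1 Pro", "DeepSeek V4", "Claude Opus 4.6", "GPT-5.4",
    "Grok 4.20", "Claude Sonnet 4.6", "MiMo-V2-Flash", "Gemini 2.5 Flash", "MiniMax M2.5",
]

# N marks a cell with no score: the self-judgment diagonal, or a cell shown as "·".
N = None
matrix = [
    [N,   N,   8.3, 7.1, 8.8,  8.7, 7.3, 8.7, 8.4, 7.6],  # judge: GPT-OSS-120B
    [7.2, N,   8.2, 7.0, 10.0, 9.6, 7.2, 7.3, 8.3, 8.8],  # judge: Gemini 3.1 Pro
    [8.8, 8.4, N,   8.8, 9.0,  9.0, 9.0, 9.0, 9.0, 8.4],  # judge: DeepSeek V4
    [8.0, 6.3, 6.8, N,   9.0,  9.2, 8.3, 8.3, 7.3, 6.8],  # judge: Claude Opus 4.6
    [5.2, 4.8, 6.5, 6.1, N,    7.3, 6.0, 7.5, 6.2, 6.7],  # judge: GPT-5.4
    [8.6, 7.5, 8.3, 8.8, 9.0,  N,   8.6, 8.8, 8.6, 8.3],  # judge: Grok 4.20
    [8.6, 6.5, 7.5, 9.0, 9.0,  9.0, N,   8.3, 8.2, 7.5],  # judge: Claude Sonnet 4.6
    [8.6, 8.6, 9.0, 9.0, 9.0,  9.0, 8.4, N,   8.6, 8.6],  # judge: MiMo-V2-Flash
    [9.4, 8.4, 8.8, 9.0, 9.4,  9.4, 9.2, 9.4, N,   8.8],  # judge: Gemini 2.5 Flash
    [N,   7.1, 7.8, 7.8, 8.8,  8.8, 7.9, 8.8, 8.1, N],    # judge: MiniMax M2.5
]

# All scored cells, flattened (assumed basis for the matrix average).
judgments = [v for row in matrix for v in row if v is not None]

# Assumed scoring rule: mean of the scores each respondent received (its column).
received = {
    name: mean(row[j] for row in matrix if row[j] is not None)
    for j, name in enumerate(models)
}

winner = max(received, key=received.get)
matrix_avg = mean(judgments)

print(f"judgments counted: {len(judgments)}")
print(f"winner: {winner} ({received[winner]:.2f})")
print(f"matrix avg: {matrix_avg:.2f}")
```

Reading the matrix row-wise instead (mean of each row) would show how strict each judge was rather than how well each respondent scored.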