Evaluations / EVAL-20260402-171914
reasoning
Apr 02, 2026
REASON-019

You must choose between three investments. Investment A returns 10% with 90% probability, -50% with 10% probability. Investment B returns 5% with certainty. Investment C returns 100% with 20% probability, 0% with 80% probability. (1) Rank them by expected value. (2) Rank them by the Kelly criterion. (3) You have $10,000 — your entire savings. Does this change your answer? Why? (4) Now you have $10,000,000. Does it change again? Derive the general principle.
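For reference, the quantities the prompt asks about can be computed directly from the stated payoffs. The sketch below is one reading, not an answer key: it assumes "rank by the Kelly criterion" means comparing expected log growth at full allocation, and it finds the growth-optimal fraction by a simple grid search over 0% to 100% of wealth (no leverage). The names `investments`, `expected_return`, `expected_log_growth`, and `best_fraction` are illustrative.

```python
import math

# Outcomes from the prompt as (probability, fractional return) pairs.
investments = {
    "A": [(0.9, 0.10), (0.1, -0.50)],
    "B": [(1.0, 0.05)],
    "C": [(0.2, 1.00), (0.8, 0.00)],
}

def expected_return(outcomes):
    """Probability-weighted average return."""
    return sum(p * r for p, r in outcomes)

def expected_log_growth(outcomes, fraction=1.0):
    """Expected log of the wealth multiplier when `fraction` of wealth is invested."""
    return sum(p * math.log(1.0 + fraction * r) for p, r in outcomes)

ev = {name: expected_return(o) for name, o in investments.items()}
growth = {name: expected_log_growth(o) for name, o in investments.items()}

# Rankings, best first.
ev_ranking = sorted(ev, key=ev.get, reverse=True)
growth_ranking = sorted(growth, key=growth.get, reverse=True)

# Assumption: fractions limited to [0, 1] in 1% steps (no borrowing).
fractions = [i / 100 for i in range(101)]
best_fraction = {
    name: max(fractions, key=lambda f, o=o: expected_log_growth(o, f))
    for name, o in investments.items()
}

print("EV ranking:", ev_ranking)
print("Log-growth ranking:", growth_ranking)
print("Growth-optimal fractions:", best_fraction)
```

Under these assumptions both rankings come out the same (C, then B, then A), and only Investment A has a growth-optimal fraction below 100%, because it is the only option with a possible loss.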

Winner: GPT-5.4 (openrouter)
Winner score: 9.37
Matrix avg: 7.15
10×10 Judgment Matrix · 87 judgments
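| Judge ↓ / Respondent → | Gemini 3.1 Pro | DeepSeek V4 | Claude Opus 4.6 | Grok 4.20 | GPT-5.4 | Claude Sonnet 4.6 | MiMo-V2-Flash | GPT-OSS-120B | Gemini 2.5 Flash | MiniMax M2.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro |  | 5.6 | 8.3 | · | 10.0 | 9.4 | 4.5 | 3.8 | 4.0 | 9.0 |
| DeepSeek V4 | 8.3 |  | 9.7 | 9.2 | 9.4 | 9.7 | 9.2 | 7.9 | 8.1 | 9.4 |
| Claude Opus 4.6 | 1.6 | 5.8 |  | 6.7 | 9.2 | 8.9 | 4.0 | 4.3 | 4.6 | 5.0 |
| Grok 4.20 | 3.4 | 6.6 | 8.8 |  | 8.8 | 8.8 | 5.3 | 5.7 | 4.6 | 7.6 |
| GPT-5.4 | 0.7 | 4.0 | 7.8 | 7.0 |  | 7.6 | 4.0 | 2.6 | 4.8 | 6.3 |
| Claude Sonnet 4.6 | 1.9 | 6.8 | 9.4 | 7.5 | · |  | 4.8 | 5.9 | 6.0 | 6.0 |
| MiMo-V2-Flash | 3.0 | 8.2 | 9.0 | 8.6 | 9.0 | 9.4 |  | 4.8 | 8.0 | 8.8 |
| GPT-OSS-120B | 2.8 | 8.4 | 8.0 | 6.8 | 8.7 | 8.8 | · |  | 6.5 | 8.6 |
| Gemini 2.5 Flash | 9.0 | 8.2 | 9.7 | 10.0 | 10.0 | 10.0 | 9.0 | 7.7 |  | 9.0 |
| MiniMax M2.5 | 2.1 | 8.2 | 9.0 | 9.0 | 9.8 | 9.0 | 9.4 | 6.0 | 8.4 |  |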
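The page does not state how the winner score is aggregated. As a rough check, the sketch below assumes each respondent's score is the plain mean of the displayed cells in its column, skipping self-judgments and missing ("·") cells, both encoded as `None`. The values are transcribed from the matrix above; `models`, `matrix`, and `respondent_mean` are illustrative names, and because the displayed cells are rounded, small differences from the reported figures are expected.

```python
# Respondent order matches the matrix columns.
models = [
    "Gemini 3.1 Pro", "DeepSeek V4", "Claude Opus 4.6", "Grok 4.20", "GPT-5.4",
    "Claude Sonnet 4.6", "MiMo-V2-Flash", "GPT-OSS-120B", "Gemini 2.5 Flash",
    "MiniMax M2.5",
]

# One row per judge; None marks a self-judgment or a missing cell.
matrix = {
    "Gemini 3.1 Pro":    [None, 5.6, 8.3, None, 10.0, 9.4, 4.5, 3.8, 4.0, 9.0],
    "DeepSeek V4":       [8.3, None, 9.7, 9.2, 9.4, 9.7, 9.2, 7.9, 8.1, 9.4],
    "Claude Opus 4.6":   [1.6, 5.8, None, 6.7, 9.2, 8.9, 4.0, 4.3, 4.6, 5.0],
    "Grok 4.20":         [3.4, 6.6, 8.8, None, 8.8, 8.8, 5.3, 5.7, 4.6, 7.6],
    "GPT-5.4":           [0.7, 4.0, 7.8, 7.0, None, 7.6, 4.0, 2.6, 4.8, 6.3],
    "Claude Sonnet 4.6": [1.9, 6.8, 9.4, 7.5, None, None, 4.8, 5.9, 6.0, 6.0],
    "MiMo-V2-Flash":     [3.0, 8.2, 9.0, 8.6, 9.0, 9.4, None, 4.8, 8.0, 8.8],
    "GPT-OSS-120B":      [2.8, 8.4, 8.0, 6.8, 8.7, 8.8, None, None, 6.5, 8.6],
    "Gemini 2.5 Flash":  [9.0, 8.2, 9.7, 10.0, 10.0, 10.0, 9.0, 7.7, None, 9.0],
    "MiniMax M2.5":      [2.1, 8.2, 9.0, 9.0, 9.8, 9.0, 9.4, 6.0, 8.4, None],
}

def respondent_mean(name):
    """Mean of all scores a respondent received (its column), ignoring None."""
    col = models.index(name)
    scores = [row[col] for row in matrix.values() if row[col] is not None]
    return sum(scores) / len(scores)

# Number of filled cells; should agree with the judgment count in the heading.
judgment_count = sum(
    1 for row in matrix.values() for score in row if score is not None
)

means = {name: respondent_mean(name) for name in models}
winner = max(means, key=means.get)

print(judgment_count, winner, round(means[winner], 2))
```

With these transcribed values the filled-cell count is 87, matching the heading, and GPT-5.4 has the highest column mean at about 9.36, close to the reported 9.37.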