← Evaluations/EVAL-20260402-213046
analysis
Apr 02, 2026ANALYSIS-029

Your startup generates 80% of its revenue through an API that depends on OpenAI's GPT models. OpenAI announces they're launching a competing product. (1) Assess the platform risk. (2) What signals should you have been watching? (3) Design a 90-day emergency plan. (4) How should startups building on top of AI platforms structure their businesses to minimize this risk from day one?

Winner
Grok 4.20
openrouter
9.14
WINNER SCORE
matrix avg: 8.69
results.json report.mdFull dataset (CSV) →
10×10 Judgment Matrix · 88 judgments
OPEN DATA
Judge ↓ / Respondent →Gemini 3.1 ProClaude Opus 4.6MiniMax M2.5GPT-5.4DeepSeek V4MiMo-V2-FlashClaude Sonnet 4.6Grok 4.20GPT-OSS-120BGemini 3
Gemini 3.1 Pro8.39.87.59.610.08.19.67.310.0
Claude Opus 4.67.79.29.28.88.98.89.28.38.8
MiniMax M2.56.18.68.68.29.0·9.07.68.3
GPT-5.46.98.29.08.08.36.88.66.08.2
DeepSeek V48.69.08.89.09.89.09.68.89.0
MiMo-V2-Flash8.89.29.39.08.89.09.09.09.3
Claude Sonnet 4.68.69.08.89.28.38.89.08.88.6
Grok 4.208.68.88.89.08.68.88.88.68.6
GPT-OSS-120B6.59.08.88.18.38.08.38.78.0
Gemini 38.8·9.69.69.49.89.69.89.6