Evaluation EVAL-20260402-193307
analysis
Apr 02, 2026 · ANALYSIS-013

A mobile app shows: DAU 100K (up 50%), WAU 200K (up 20%), MAU 500K (up 10%), D1 retention 40%, D7 retention 15%, D30 retention 5%, Session length 8min (down 20%), Sessions/day 2.1 (up 30%). The PM says 'we're growing fast.' (1) What story does this data actually tell? (2) What's concerning about the DAU/MAU ratio? (3) Why might session length decrease as sessions increase? (4) What one metric would you focus on and why?
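The ratios the prompt asks about follow directly from the quoted figures. A minimal sketch of that arithmetic is below. It assumes each "up/down X%" is measured against the prior period of the same metric, which the prompt does not state, and the variable names are illustrative only.

```python
# Figures copied from the prompt; everything derived below is plain arithmetic.
dau, wau, mau = 100_000, 200_000, 500_000
dau_growth, mau_growth = 0.50, 0.10            # "up 50%", "up 10%"
session_min, session_min_change = 8.0, -0.20   # "8min (down 20%)"
sessions_per_day, sessions_change = 2.1, 0.30  # "2.1 (up 30%)"

# DAU/MAU ratio, often called stickiness.
stickiness = dau / mau

# Assumption: growth is relative to the prior period, so prior = current / (1 + growth).
prev_dau = dau / (1 + dau_growth)
prev_mau = mau / (1 + mau_growth)
prev_stickiness = prev_dau / prev_mau

# Total daily time per active user, now and in the prior period.
minutes_per_user = session_min * sessions_per_day
prev_minutes_per_user = (session_min / (1 + session_min_change)) * (
    sessions_per_day / (1 + sessions_change)
)
time_change = minutes_per_user / prev_minutes_per_user - 1

print(f"DAU/MAU now: {stickiness:.1%}, prior: {prev_stickiness:.1%}")
print(f"Minutes per user per day: {minutes_per_user:.1f} ({time_change:+.1%})")
```

Under these assumptions, a 20% drop in session length combined with a 30% rise in sessions per day leaves total time per user roughly flat (0.8 × 1.3 = 1.04), which bears on question (3).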

Winner: Claude Sonnet 4.6 (openrouter)
Winner score: 9.43
Matrix avg: 8.91
10×10 Judgment Matrix · 90 judgments
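| Judge ↓ / Respondent → | GPT-OSS-120B | Gemini 3.1 Pro | Claude Opus 4.6 | GPT-5.4 | DeepSeek V4 | MiMo-V2-Flash | Claude Sonnet 4.6 | Grok 4.20 | Gemini 3 | MiniMax M2.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-OSS-120B | - | 6.7 | 8.7 | 8.4 | 8.4 | 8.7 | 8.8 | 8.7 | 8.3 | 8.4 |
| Gemini 3.1 Pro | 8.5 | - | 8.8 | 9.8 | 9.8 | 6.3 | 10.0 | 9.6 | 9.8 | 8.8 |
| Claude Opus 4.6 | 9.4 | 7.5 | - | 9.2 | 8.0 | 9.2 | 9.8 | 9.2 | 9.2 | 9.2 |
| GPT-5.4 | 8.0 | 6.3 | 8.2 | - | 8.8 | 9.0 | 9.2 | 8.8 | 8.6 | 8.8 |
| DeepSeek V4 | 9.0 | 8.6 | 10.0 | 9.0 | - | 9.0 | 9.6 | 9.0 | 9.0 | 8.8 |
| MiMo-V2-Flash | 9.2 | 8.6 | 9.6 | 8.8 | 9.2 | - | 9.6 | 9.3 | 9.3 | 9.3 |
| Claude Sonnet 4.6 | 9.6 | 8.1 | 9.6 | 8.8 | 8.6 | 8.8 | - | 8.8 | 8.8 | 9.0 |
| Grok 4.20 | 8.8 | 8.1 | 8.8 | 8.8 | 8.8 | 8.8 | 9.0 | - | 8.6 | 8.8 |
| Gemini 3 | 10.0 | 8.8 | 10.0 | 9.8 | 9.8 | 9.8 | 10.0 | 9.8 | - | 9.8 |
| MiniMax M2.5 | 9.0 | 7.0 | 9.0 | 9.0 | 8.4 | 8.8 | 9.0 | 9.0 | 8.8 | - |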
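
The summary figures can be roughly cross-checked against the matrix. The sketch below reads rows as judges and columns as respondents, with each model's self-judgment omitted (consistent with 90 judgments in a 10×10 grid). It assumes, without confirmation from the page, that a respondent's score is the plain mean of its column and that the matrix average is the mean of all 90 cells. The helper `respondent_means` is hypothetical.

```python
from statistics import mean

models = [
    "GPT-OSS-120B", "Gemini 3.1 Pro", "Claude Opus 4.6", "GPT-5.4", "DeepSeek V4",
    "MiMo-V2-Flash", "Claude Sonnet 4.6", "Grok 4.20", "Gemini 3", "MiniMax M2.5",
]

# Rows are judges, columns are respondents; None marks the omitted self-judgment.
matrix = [
    [None, 6.7, 8.7, 8.4, 8.4, 8.7, 8.8, 8.7, 8.3, 8.4],
    [8.5, None, 8.8, 9.8, 9.8, 6.3, 10.0, 9.6, 9.8, 8.8],
    [9.4, 7.5, None, 9.2, 8.0, 9.2, 9.8, 9.2, 9.2, 9.2],
    [8.0, 6.3, 8.2, None, 8.8, 9.0, 9.2, 8.8, 8.6, 8.8],
    [9.0, 8.6, 10.0, 9.0, None, 9.0, 9.6, 9.0, 9.0, 8.8],
    [9.2, 8.6, 9.6, 8.8, 9.2, None, 9.6, 9.3, 9.3, 9.3],
    [9.6, 8.1, 9.6, 8.8, 8.6, 8.8, None, 8.8, 8.8, 9.0],
    [8.8, 8.1, 8.8, 8.8, 8.8, 8.8, 9.0, None, 8.6, 8.8],
    [10.0, 8.8, 10.0, 9.8, 9.8, 9.8, 10.0, 9.8, None, 9.8],
    [9.0, 7.0, 9.0, 9.0, 8.4, 8.8, 9.0, 9.0, 8.8, None],
]


def respondent_means(rows):
    """Mean score each respondent received, skipping the missing diagonal."""
    return [
        mean(row[col] for row in rows if row[col] is not None)
        for col in range(len(rows))
    ]


scores = dict(zip(models, respondent_means(matrix)))
winner = max(scores, key=scores.get)
overall = mean(value for row in matrix for value in row if value is not None)

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<18} {score:.2f}")
print(f"winner: {winner}, matrix avg: {overall:.2f}")
```

With the one-decimal values shown, the column mean for Claude Sonnet 4.6 comes out near 9.44 rather than the reported 9.43. The published score may be computed from unrounded judgments or by a different aggregation.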