Evaluations / EVAL-20260402-151829
code · Apr 02, 2026 · CODE-030

Design a GraphQL schema for a social media platform with users, posts, comments, likes, and follows. Address: N+1 query problem with DataLoader pattern, cursor-based pagination, proper input validation, rate limiting per field, and subscription for real-time updates. Include resolver implementations for the trickiest queries.

Winner: Gemini 3 Flash Preview (Google)
Winner score: 8.24 (matrix avg: 7.15)
10×10 Judgment Matrix · 88 judgments
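| Judge ↓ / Respondent → | DeepSeek V4 | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro | Claude Sonnet 4.6 | Grok 4.20 | GPT-OSS-120B | Gemini 3 | MiniMax M2.5 | MiMo-V2-Flash |
|---|---|---|---|---|---|---|---|---|---|---|
| DeepSeek V4 |  | 9.0 | 8.8 | 7.8 | 8.6 | 8.8 | 9.4 | 9.0 | 8.6 | 8.8 |
| GPT-5.4 | 6.8 |  | 4.7 | 1.9 | 2.6 | 5.4 | 4.7 | 7.2 | 1.9 | 5.0 |
| Claude Opus 4.6 | 7.2 | 6.7 |  | 3.1 | 6.3 | 7.0 | 7.7 | 7.8 | 5.1 | 7.2 |
| Gemini 3.1 Pro | 6.4 | 6.2 | 5.5 |  | 6.3 | 8.1 | 6.6 | 9.0 | 4.4 | 6.1 |
| Claude Sonnet 4.6 | 8.0 | 8.6 | 8.0 | 5.7 |  | 8.2 | 8.8 | 8.3 | 6.5 | 7.8 |
| Grok 4.20 | 7.8 | 9.0 | 8.7 | 5.8 | 7.9 |  | 8.7 | 8.2 | 7.0 | 7.8 |
| GPT-OSS-120B | · | 7.5 | 6.1 | 5.0 | 4.9 | 7.0 |  | 7.3 | · | 6.6 |
| Gemini 3 | 9.6 | 9.6 | 9.0 | 7.1 | 8.3 | 8.8 | 8.8 |  | 8.3 | 8.8 |
| MiniMax M2.5 | 7.5 | 6.9 | 5.8 | 4.0 | 6.3 | 7.3 | 8.3 | 8.0 |  | 7.2 |
| MiMo-V2-Flash | 9.0 | 8.8 | 8.0 | 3.2 | 7.0 | 8.0 | 8.8 | 9.3 | 7.3 |  |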
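
The matrix above can be aggregated per respondent. The following is a minimal sketch, not the evaluation's own scoring code. It assumes each respondent's score is the plain mean of its column, that self-judgments are excluded, and that the two "·" cells are missing judgments that are simply skipped. The names `MODELS`, `SCORES`, `NA`, `respondent_means`, and `all_judgments` are hypothetical, introduced only for illustration.

```python
from statistics import mean

# Model names in the column (respondent) order of the judgment matrix.
MODELS = [
    "DeepSeek V4", "GPT-5.4", "Claude Opus 4.6", "Gemini 3.1 Pro",
    "Claude Sonnet 4.6", "Grok 4.20", "GPT-OSS-120B", "Gemini 3",
    "MiniMax M2.5", "MiMo-V2-Flash",
]

NA = None  # a self-judgment (diagonal) or a missing "·" cell

# SCORES[i][j] is the score that judge MODELS[i] gave respondent MODELS[j].
SCORES = [
    [NA, 9.0, 8.8, 7.8, 8.6, 8.8, 9.4, 9.0, 8.6, 8.8],
    [6.8, NA, 4.7, 1.9, 2.6, 5.4, 4.7, 7.2, 1.9, 5.0],
    [7.2, 6.7, NA, 3.1, 6.3, 7.0, 7.7, 7.8, 5.1, 7.2],
    [6.4, 6.2, 5.5, NA, 6.3, 8.1, 6.6, 9.0, 4.4, 6.1],
    [8.0, 8.6, 8.0, 5.7, NA, 8.2, 8.8, 8.3, 6.5, 7.8],
    [7.8, 9.0, 8.7, 5.8, 7.9, NA, 8.7, 8.2, 7.0, 7.8],
    [NA, 7.5, 6.1, 5.0, 4.9, 7.0, NA, 7.3, NA, 6.6],
    [9.6, 9.6, 9.0, 7.1, 8.3, 8.8, 8.8, NA, 8.3, 8.8],
    [7.5, 6.9, 5.8, 4.0, 6.3, 7.3, 8.3, 8.0, NA, 7.2],
    [9.0, 8.8, 8.0, 3.2, 7.0, 8.0, 8.8, 9.3, 7.3, NA],
]


def respondent_means(scores, models):
    """Mean score received by each respondent (column), skipping empty cells."""
    result = {}
    for j, name in enumerate(models):
        column = [row[j] for row in scores if row[j] is not None]
        result[name] = mean(column)
    return result


def all_judgments(scores):
    """Flatten the matrix into the list of scores that were actually given."""
    return [s for row in scores for s in row if s is not None]


means = respondent_means(SCORES, MODELS)
winner = max(means, key=means.get)
judgments = all_judgments(SCORES)

print(f"judgments: {len(judgments)}")
print(f"winner: {winner} ({means[winner]:.2f})")
print(f"matrix avg: {mean(judgments):.2f}")
```

Under these assumptions the sketch counts 88 judgments and ranks the Gemini 3 column first. Because the cells are rounded to one decimal, the recomputed figures (about 8.23 for the winner and 7.16 for the matrix average) differ slightly from the reported 8.24 and 7.15, which were presumably computed from unrounded scores.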