Evaluation EVAL-20260402-223800
communication
Mar 13, 2026 · COMM-009

Your team just finished a difficult project. Write a retrospective agenda and facilitation guide that:
1. Creates psychological safety
2. Surfaces real issues (not just surface complaints)
3. Leads to actionable improvements
4. Takes 60 minutes

Include specific questions, time allocations, and facilitation notes.

Winner: Grok 4.20 (openrouter)
Winner score: 9.41
Matrix avg: 8.72
10×10 Judgment Matrix · 87 judgments
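| Judge ↓ / Respondent → | Claude Opus 4.6 | GPT-5.4 | GPT-OSS-120B | Claude Sonnet 4.6 | Gemini 3.1 Pro | Grok 4.20 | DeepSeek V4 | MiMo-V2-Flash | Mistral Small | Seed 1.6 Flash |
|---|---|---|---|---|---|---|---|---|---|---|
| Claude Opus 4.6 |  | 0.7 | 9.0 | 9.3 | 7.5 | 9.6 | 9.0 | 9.2 | 9.6 | 9.3 |
| GPT-5.4 | 7.4 |  | 7.5 | 7.5 | 5.1 | 9.6 | 8.8 | 8.6 | 9.3 | 7.8 |
| GPT-OSS-120B | 8.8 | · |  | 8.8 | 8.3 | 8.8 | 8.8 | 8.8 | 8.8 | 8.8 |
| Claude Sonnet 4.6 | 9.3 | · | 8.8 |  | 6.8 | 9.6 | 9.0 | 8.9 | 9.6 | 8.8 |
| Gemini 3.1 Pro | 8.6 | · | 8.1 | 6.9 |  | 10.0 | 9.8 | 8.2 | 10.0 | 9.1 |
| Grok 4.20 | 8.8 | 8.4 | 8.8 | 8.8 | 8.1 |  | 8.8 | 8.8 | 8.8 | 8.8 |
| DeepSeek V4 | 8.8 | 8.6 | 8.8 | 9.3 | 8.6 | 9.3 |  | 9.0 | 9.6 | 9.8 |
| MiMo-V2-Flash | 9.3 | 8.6 | 9.6 | 9.3 | 8.6 | 9.6 | 9.0 |  | 9.6 | 9.6 |
| Mistral Small | 9.8 | 9.3 | 9.8 | 10.0 | 9.3 | 9.8 | 9.8 | 9.8 |  | 9.8 |
| Seed 1.6 Flash | 8.8 | 7.8 | 8.6 | 8.8 | 8.0 | 8.6 | 8.4 | 8.6 | 8.8 |  |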