Evaluation EVAL-20260402-231136
communication
Apr 02, 2026 · COMM-016

Explain how HTTPS works to someone who only knows that 'the lock icon means secure.' Cover: (1) What actually happens when you type https://example.com, (2) Why you need certificates, (3) What a man-in-the-middle attack is, (4) Why public WiFi is risky even with HTTPS. Use analogies only — no technical terms (no 'encryption,' 'certificate,' 'handshake'). Max 400 words.

Winner: Claude Opus 4.6 (openrouter)
Winner score: 9.00
Matrix avg: 8.11
10×10 Judgment Matrix · 87 judgments
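| Judge ↓ / Respondent → | GPT-OSS-120B | Claude Opus 4.6 | GPT-5.4 | Claude Sonnet 4.6 | Gemini 3.1 Pro | Grok 4.20 | DeepSeek V4 | MiMo-V2-Flash | Mistral Small | Seed 1.6 Flash |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-OSS-120B |  | 8.7 | 8.4 | 8.2 | 7.8 | 8.7 | 8.0 | 8.7 | 7.1 | 8.3 |
| Claude Opus 4.6 | 7.3 |  | 8.8 | 8.6 | 7.5 | 8.3 | 6.5 | 7.3 | 7.0 | 5.5 |
| GPT-5.4 | 7.8 | 9.0 |  | 8.0 | 5.8 | 9.0 | 8.2 | 8.4 | 6.8 | 5.8 |
| Claude Sonnet 4.6 | 8.2 | 9.0 | 8.0 |  | 7.0 | 8.3 | 7.5 | 7.8 | 6.4 | 5.0 |
| Gemini 3.1 Pro | 8.1 | 9.8 | 9.8 | · |  | 8.2 | 8.2 | 6.5 | 4.8 | 4.2 |
| Grok 4.20 | 8.3 | 8.3 | 8.3 | 9.0 | 6.3 |  | 8.3 | 7.4 | 7.5 | 6.3 |
| DeepSeek V4 | 9.0 | · | 8.6 | 9.0 | 8.8 | 8.6 |  | 8.6 | 9.0 | 8.8 |
| MiMo-V2-Flash | 9.0 | 9.2 | 9.0 | 9.2 | 8.6 | 9.0 | 9.0 |  | 9.0 | 8.6 |
| Mistral Small | 9.6 | 9.4 | 9.4 | 9.4 | 8.8 | 9.4 | 9.0 | 9.4 |  | 8.8 |
| Seed 1.6 Flash | 8.2 | 8.7 | 8.4 | 8.2 | 6.0 | · | 8.0 | 8.0 | 8.2 |  |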
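
The page does not say how the winner score is derived from the matrix. A minimal sketch, assuming each row holds one judge's scores, the diagonal (self-judgment) is omitted, and a respondent's score is the plain mean of its column. The names `MODELS`, `MATRIX`, and `respondent_means` are illustrative, not part of the published dataset, and because the displayed cells are rounded the result may differ slightly from the figures shown above.

```python
from statistics import mean

MODELS = [
    "GPT-OSS-120B", "Claude Opus 4.6", "GPT-5.4", "Claude Sonnet 4.6",
    "Gemini 3.1 Pro", "Grok 4.20", "DeepSeek V4", "MiMo-V2-Flash",
    "Mistral Small", "Seed 1.6 Flash",
]

# Rows are judges, columns are respondents (same order as MODELS).
# None marks either the diagonal (assumed: no self-judgment) or a
# missing cell shown as "·" on the page.
MATRIX = [
    [None, 8.7, 8.4, 8.2, 7.8, 8.7, 8.0, 8.7, 7.1, 8.3],
    [7.3, None, 8.8, 8.6, 7.5, 8.3, 6.5, 7.3, 7.0, 5.5],
    [7.8, 9.0, None, 8.0, 5.8, 9.0, 8.2, 8.4, 6.8, 5.8],
    [8.2, 9.0, 8.0, None, 7.0, 8.3, 7.5, 7.8, 6.4, 5.0],
    [8.1, 9.8, 9.8, None, None, 8.2, 8.2, 6.5, 4.8, 4.2],
    [8.3, 8.3, 8.3, 9.0, 6.3, None, 8.3, 7.4, 7.5, 6.3],
    [9.0, None, 8.6, 9.0, 8.8, 8.6, None, 8.6, 9.0, 8.8],
    [9.0, 9.2, 9.0, 9.2, 8.6, 9.0, 9.0, None, 9.0, 8.6],
    [9.6, 9.4, 9.4, 9.4, 8.8, 9.4, 9.0, 9.4, None, 8.8],
    [8.2, 8.7, 8.4, 8.2, 6.0, None, 8.0, 8.0, 8.2, None],
]


def respondent_means(matrix, models):
    """Average each column over the judges that actually scored it."""
    means = {}
    for col, name in enumerate(models):
        scores = [row[col] for row in matrix if row[col] is not None]
        means[name] = mean(scores)
    return means


# Number of filled cells, i.e. judgments actually recorded.
judgment_count = sum(cell is not None for row in MATRIX for cell in row)
means = respondent_means(MATRIX, MODELS)
winner = max(means, key=means.get)

if __name__ == "__main__":
    print(f"judgments: {judgment_count}")
    for name, score in sorted(means.items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {score:.2f}")
```

Under these assumptions the filled-cell count matches the 87 judgments stated in the matrix heading, and the top column mean belongs to Claude Opus 4.6.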