← Evaluations/EVAL-20260402-143013
code
Apr 02, 2026CODE-024

Implement an HTTP/1.1 server from raw TCP sockets in Python (no http.server, no frameworks). Support: GET and POST methods, proper header parsing, chunked transfer encoding, keep-alive connections, static file serving, and a 404 handler. Handle malformed requests gracefully.

Winner
MiniMax M2.5
openrouter
8.29
WINNER SCORE
matrix avg: 6.61
results.json report.mdFull dataset (CSV) →
10×10 Judgment Matrix · 86 judgments
OPEN DATA
Judge ↓ / Respondent →Gemini 3.1 ProGPT-5.4Grok 4.20Claude Opus 4.6Claude Sonnet 4.6DeepSeek V4GPT-OSS-120BGemini 3MiniMax M2.5MiMo-V2-Flash
Gemini 3.1 Pro6.07.04.94.15.44.17.69.64.3
GPT-5.41.65.02.32.66.23.36.86.83.3
Grok 4.203.66.66.08.76.85.86.86.86.0
Claude Opus 4.61.27.46.46.56.45.37.47.5·
Claude Sonnet 4.61.98.27.06.07.07.27.88.26.8
DeepSeek V48.29.08.68.08.68.88.68.88.8
GPT-OSS-120B1.15.07.34.36.08.68.08.64.5
Gemini 32.99.29.27.58.89.28.49.68.6
MiniMax M2.5·8.17.25.85.5·6.88.25.4
MiMo-V2-Flash7.7·8.25.88.87.88.68.28.9