EVAL-20260402-124253
code
Mar 03, 2026 · CODE-008

Implement a production-ready API rate limiter with the following requirements:
1. Token bucket algorithm
2. Support for different rate limits per API key
3. Redis backend for distributed systems
4. Graceful degradation when Redis is unavailable
5. Proper async support
6. Comprehensive logging
Include the main class, Redis integration, and a FastAPI middleware example.

Winner: Gemini 3 Flash Preview (Google)
Winner score: 8.28
Matrix avg: 6.70
10×10 Judgment Matrix · 87 judgments
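| Judge ↓ / Respondent → | GPT-5.4 | Gemini 3 | Claude Opus 4.6 | Gemini 3.1 Pro | Claude Sonnet 4.6 | Grok 4.20 | DeepSeek V4 | GPT-OSS-120B | MiniMax M2.5 | MiMo-V2-Flash |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-5.4 | | 7.8 | 3.7 | 1.6 | 4.0 | 5.4 | 4.8 | 3.0 | 2.9 | 5.2 |
| Gemini 3 | 9.6 | | 9.6 | 3.0 | 8.8 | 9.3 | 9.2 | 8.8 | 7.0 | 9.0 |
| Claude Opus 4.6 | 7.0 | 7.8 | | 0.8 | 6.0 | 8.0 | 7.0 | 6.3 | 3.7 | 6.6 |
| Gemini 3.1 Pro | 7.0 | 8.8 | 5.4 | | 5.8 | 8.3 | 4.8 | 5.8 | 3.2 | 7.8 |
| Claude Sonnet 4.6 | 8.2 | 8.6 | 8.4 | 1.6 | | 8.3 | 7.5 | 7.2 | 5.7 | 7.4 |
| Grok 4.20 | 8.4 | 7.8 | 7.4 | 3.6 | · | | 6.6 | 6.4 | 5.8 | 6.6 |
| DeepSeek V4 | 8.6 | · | 8.6 | 8.4 | 8.8 | 8.8 | | 9.2 | 7.6 | 9.0 |
| GPT-OSS-120B | 5.7 | 8.6 | 5.1 | 2.4 | 5.8 | 7.0 | 6.2 | | 5.7 | 7.5 |
| MiniMax M2.5 | 6.3 | 7.5 | 6.8 | 1.0 | 6.5 | · | 7.4 | 6.5 | | 7.8 |
| MiMo-V2-Flash | 8.6 | 9.3 | 8.0 | 5.5 | 8.6 | 8.3 | 7.8 | 8.6 | 3.2 | |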
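
The page does not say how the winner score is derived from the judgment matrix. The sketch below tests one plausible reading: each respondent's score is the plain mean of the scores it received from the other judges, skipping empty cells. The `MODELS` list, the `MATRIX` layout, and the `respondent_means` helper are illustrative names, not part of the published dataset. The values are the one-decimal scores shown in the matrix, so results may differ slightly from figures computed on unrounded data.

```python
from statistics import mean

MODELS = [
    "GPT-5.4", "Gemini 3", "Claude Opus 4.6", "Gemini 3.1 Pro",
    "Claude Sonnet 4.6", "Grok 4.20", "DeepSeek V4", "GPT-OSS-120B",
    "MiniMax M2.5", "MiMo-V2-Flash",
]

# Rows are judges, columns are respondents, both in MODELS order.
# None marks a cell without a score: the diagonal (no self-judgment)
# and the three cells shown as "·" in the matrix.
N = None
MATRIX = [
    [N,   7.8, 3.7, 1.6, 4.0, 5.4, 4.8, 3.0, 2.9, 5.2],
    [9.6, N,   9.6, 3.0, 8.8, 9.3, 9.2, 8.8, 7.0, 9.0],
    [7.0, 7.8, N,   0.8, 6.0, 8.0, 7.0, 6.3, 3.7, 6.6],
    [7.0, 8.8, 5.4, N,   5.8, 8.3, 4.8, 5.8, 3.2, 7.8],
    [8.2, 8.6, 8.4, 1.6, N,   8.3, 7.5, 7.2, 5.7, 7.4],
    [8.4, 7.8, 7.4, 3.6, N,   N,   6.6, 6.4, 5.8, 6.6],
    [8.6, N,   8.6, 8.4, 8.8, 8.8, N,   9.2, 7.6, 9.0],
    [5.7, 8.6, 5.1, 2.4, 5.8, 7.0, 6.2, N,   5.7, 7.5],
    [6.3, 7.5, 6.8, 1.0, 6.5, N,   7.4, 6.5, N,   7.8],
    [8.6, 9.3, 8.0, 5.5, 8.6, 8.3, 7.8, 8.6, 3.2, N],
]


def respondent_means(matrix):
    """Average each column over the judges that actually scored it."""
    means = {}
    for col, name in enumerate(MODELS):
        scores = [row[col] for row in matrix if row[col] is not None]
        means[name] = mean(scores)
    return means


means = respondent_means(MATRIX)
judgments = sum(v is not None for row in MATRIX for v in row)
winner = max(means, key=means.get)

print("judgments:", judgments)
print("top respondent:", winner)
for name, score in sorted(means.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {score:.3f}")
```

Under this assumption the count of filled cells is 87, matching the matrix header, and the Gemini 3 column averages 8.275 over its eight available scores, which is consistent with the reported winner score of 8.28.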