Evaluation EVAL-20260402-124935
code
Mar 10, 2026 · CODE-009

This Python application has a memory leak. Find it and explain the fix.

```python
import threading
import time
from functools import lru_cache

class EventProcessor:
    _instances = []

    def __init__(self, name):
        self.name = name
        self.callbacks = []
        self._lock = threading.Lock()
        EventProcessor._instances.append(self)

    def register_callback(self, func):
        self.callbacks.append(func)

    @lru_cache(maxsize=10000)
    def process(self, event_data):
        results = []
        for callback in self.callbacks:
            result = callback(event_data)
            results.append(result)
        return tuple(results)

    def __del__(self):
        print(f"Processor {self.name} deleted")

def create_processor_for_request(request_id):
    processor = EventProcessor(f"processor_{request_id}")
    processor.register_callback(lambda x: x.upper())
    processor.register_callback(lambda x: len(x))
    return processor

# Simulated request handling
def handle_request(request_id, data):
    processor = create_processor_for_request(request_id)
    return processor.process(data)

# This runs for hours...
for i in range(1000000):
    result = handle_request(i, f"event_data_{i}")
    time.sleep(0.001)
```

Winner: GPT-5.4 (openrouter)
Winner score: 9.54 · matrix avg: 8.07
10×10 Judgment Matrix · 89 judgments
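| Judge ↓ / Respondent → | MiniMax M2.5 | Gemini 3.1 Pro | Grok 4.20 | GPT-5.4 | Claude Opus 4.6 | Claude Sonnet 4.6 | DeepSeek V4 | GPT-OSS-120B | Gemini 3 | MiMo-V2-Flash |
|---|---|---|---|---|---|---|---|---|---|---|
| MiniMax M2.5 | | 6.8 | 8.4 | 10.0 | 8.8 | 8.6 | 8.6 | 8.6 | 8.0 | 8.0 |
| Gemini 3.1 Pro | 4.8 | | 8.5 | 10.0 | 8.8 | 6.8 | 7.0 | 6.1 | 8.5 | 3.6 |
| Grok 4.20 | 6.5 | 5.8 | | 9.0 | 7.5 | · | 6.6 | 8.8 | 7.8 | 6.0 |
| GPT-5.4 | 5.8 | 3.3 | 8.6 | | 8.0 | 8.0 | 6.5 | 5.3 | 8.2 | 4.8 |
| Claude Opus 4.6 | 5.0 | 5.3 | 9.2 | 9.8 | | 8.9 | 6.3 | 9.3 | 8.2 | 5.5 |
| Claude Sonnet 4.6 | 6.8 | 5.9 | 9.0 | 9.3 | 8.8 | | 7.8 | 9.2 | 8.4 | 6.0 |
| DeepSeek V4 | 8.8 | 9.3 | 9.4 | 9.6 | 9.6 | 9.8 | | 9.6 | 9.6 | 9.6 |
| GPT-OSS-120B | 8.2 | 5.5 | 8.6 | 8.8 | 8.6 | 8.1 | 7.5 | | 8.6 | 7.5 |
| Gemini 3 | 9.6 | 9.3 | 10.0 | 10.0 | 10.0 | 10.0 | 9.2 | 9.8 | | 6.8 |
| MiMo-V2-Flash | 8.6 | 8.4 | 9.8 | 9.3 | 9.2 | 9.3 | 8.6 | 9.3 | 8.8 | |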
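
The page does not say how the winner score or the matrix average is derived. The sketch below is one plausible reading, not the site's documented method: it assumes a respondent's score is the mean of the scores it received from the other judges (its column), that self-judgments are excluded, and that the matrix average is the mean of all 89 recorded judgments. The names `MODELS`, `MATRIX`, `respondent_scores`, and `matrix_average` are hypothetical. Under these assumptions the matrix average rounds to the 8.07 shown above and GPT-5.4 comes out on top at about 9.53. The small gap to the displayed 9.54 is consistent with the page averaging unrounded scores, though that is a guess.

```python
from statistics import mean

# Model order matches both the rows (judges) and the columns (respondents).
MODELS = [
    "MiniMax M2.5", "Gemini 3.1 Pro", "Grok 4.20", "GPT-5.4",
    "Claude Opus 4.6", "Claude Sonnet 4.6", "DeepSeek V4",
    "GPT-OSS-120B", "Gemini 3", "MiMo-V2-Flash",
]

# None marks a cell with no score: the diagonal (assumed to be excluded
# self-judgments) and the one missing judgment shown as "·" on the page.
N = None
MATRIX = [
    [N,   6.8, 8.4, 10.0, 8.8,  8.6,  8.6, 8.6, 8.0, 8.0],
    [4.8, N,   8.5, 10.0, 8.8,  6.8,  7.0, 6.1, 8.5, 3.6],
    [6.5, 5.8, N,   9.0,  7.5,  N,    6.6, 8.8, 7.8, 6.0],
    [5.8, 3.3, 8.6, N,    8.0,  8.0,  6.5, 5.3, 8.2, 4.8],
    [5.0, 5.3, 9.2, 9.8,  N,    8.9,  6.3, 9.3, 8.2, 5.5],
    [6.8, 5.9, 9.0, 9.3,  8.8,  N,    7.8, 9.2, 8.4, 6.0],
    [8.8, 9.3, 9.4, 9.6,  9.6,  9.8,  N,   9.6, 9.6, 9.6],
    [8.2, 5.5, 8.6, 8.8,  8.6,  8.1,  7.5, N,   8.6, 7.5],
    [9.6, 9.3, 10.0, 10.0, 10.0, 10.0, 9.2, 9.8, N,  6.8],
    [8.6, 8.4, 9.8, 9.3,  9.2,  9.3,  8.6, 9.3, 8.8, N],
]


def respondent_scores(models, matrix):
    """Mean score each respondent received, skipping empty cells."""
    scores = {}
    for col, name in enumerate(models):
        received = [row[col] for row in matrix if row[col] is not None]
        scores[name] = mean(received)
    return scores


def matrix_average(matrix):
    """Mean over every recorded judgment in the matrix."""
    return mean(v for row in matrix for v in row if v is not None)


if __name__ == "__main__":
    scores = respondent_scores(MODELS, MATRIX)
    # Print respondents from highest to lowest mean received score.
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {score:.2f}")
    print(f"matrix avg: {matrix_average(MATRIX):.2f}")
```

Reading the matrix by row instead of by column gives a different view: how generous each judge was on average, which is useful for spotting lenient or harsh judges but is not what the winner score appears to measure.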