Evaluations / EVAL-20260207-141157
code
Feb 10, 2026 · CODE-005

Convert this Python code to idiomatic Rust. The code must compile, handle errors properly, and follow Rust best practices.

```python
from dataclasses import dataclass
from typing import Optional, List
from datetime import datetime

@dataclass
class Task:
    id: int
    title: str
    completed: bool
    due_date: Optional[datetime]
    tags: List[str]

class TaskManager:
    def __init__(self):
        self.tasks = []
        self.next_id = 1

    def add_task(self, title: str, due_date: Optional[datetime] = None, tags: List[str] = None) -> Task:
        task = Task(
            id=self.next_id,
            title=title,
            completed=False,
            due_date=due_date,
            tags=tags or []
        )
        self.tasks.append(task)
        self.next_id += 1
        return task

    def complete_task(self, task_id: int) -> bool:
        for task in self.tasks:
            if task.id == task_id:
                task.completed = True
                return True
        return False

    def get_overdue(self) -> List[Task]:
        now = datetime.now()
        return [t for t in self.tasks if t.due_date and t.due_date < now and not t.completed]
```
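
For orientation, here is one possible shape of an answer to this prompt. It is an editorial sketch, not any respondent's submission and not a reference solution. Several choices are assumptions: `std::time::SystemTime` stands in for Python's `datetime` so the block needs no external crates, `TaskError` is a hypothetical error type replacing the `bool` return of `complete_task`, and `overdue_at` is an added helper that takes the current time as a parameter so the overdue check can be exercised deterministically.

```rust
use std::fmt;
use std::time::SystemTime;

/// Mirrors the Python `Task` dataclass.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Task {
    pub id: u64,
    pub title: String,
    pub completed: bool,
    pub due_date: Option<SystemTime>,
    pub tags: Vec<String>,
}

/// Hypothetical error type; the Python version signals a missing task with `False`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TaskError {
    NotFound(u64),
}

impl fmt::Display for TaskError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TaskError::NotFound(id) => write!(f, "no task with id {id}"),
        }
    }
}

impl std::error::Error for TaskError {}

#[derive(Debug)]
pub struct TaskManager {
    tasks: Vec<Task>,
    next_id: u64,
}

impl Default for TaskManager {
    fn default() -> Self {
        Self::new()
    }
}

impl TaskManager {
    /// Starts with no tasks; ids begin at 1, as in the Python original.
    pub fn new() -> Self {
        Self { tasks: Vec::new(), next_id: 1 }
    }

    /// Adds a task and returns a reference to the stored value.
    /// Callers pass `Vec::new()` where Python would default `tags` to `None`.
    pub fn add_task(
        &mut self,
        title: impl Into<String>,
        due_date: Option<SystemTime>,
        tags: Vec<String>,
    ) -> &Task {
        let id = self.next_id;
        self.next_id += 1;
        self.tasks.push(Task { id, title: title.into(), completed: false, due_date, tags });
        // A task was just pushed, so the vector is non-empty.
        &self.tasks[self.tasks.len() - 1]
    }

    /// Marks the task as completed, or reports that the id is unknown.
    pub fn complete_task(&mut self, task_id: u64) -> Result<(), TaskError> {
        let task = self
            .tasks
            .iter_mut()
            .find(|t| t.id == task_id)
            .ok_or(TaskError::NotFound(task_id))?;
        task.completed = true;
        Ok(())
    }

    /// Incomplete tasks whose due date is strictly before the current time.
    pub fn get_overdue(&self) -> Vec<&Task> {
        self.overdue_at(SystemTime::now())
    }

    /// Same check against a caller-supplied `now` (added for testability).
    pub fn overdue_at(&self, now: SystemTime) -> Vec<&Task> {
        self.tasks
            .iter()
            .filter(|t| !t.completed && matches!(t.due_date, Some(due) if due < now))
            .collect()
    }
}
```

Returning `Result` makes the "task not found" case explicit at the call site, and borrowing in `get_overdue` avoids cloning tasks. A version that needs calendar dates or time zones would more likely use a date-time crate than `SystemTime`.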

Winner
Claude Opus 4.5
Anthropic
9.65
WINNER SCORE
matrix avg: 8.00
10×10 Judgment Matrix · 100 judgments
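| Judge ↓ / Respondent → | Grok Code Fast | Gemini 3 | MiniMax M2 | GPT-5.2-Codex | Claude Opus 4.5 | Gemini 3 | Claude Sonnet 4.5 | GLM-4-7 | DeepSeek V3.2 | Grok 3 (Direct) |
|---|---|---|---|---|---|---|---|---|---|---|
| Grok Code Fast | | 1.6 | 9.8 | 9.8 | 9.8 | 8.6 | 9.8 | 2.0 | 9.8 | 10.0 |
| Gemini 3 | 0.0 | | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| MiniMax M2 | 9.0 | 1.9 | | 6.5 | 9.6 | 7.8 | 9.2 | 0.0 | 7.8 | 9.8 |
| GPT-5.2-Codex | 8.8 | 1.4 | 7.5 | | 8.8 | 8.8 | 8.6 | 0.0 | 5.8 | 8.2 |
| Claude Opus 4.5 | 9.2 | 1.2 | 8.8 | 8.4 | | 9.2 | 9.2 | 0.0 | 8.6 | 9.2 |
| Gemini 3 | 9.8 | 2.3 | 9.8 | 9.8 | 9.8 | | 9.8 | 0.0 | 9.8 | 9.8 |
| Claude Sonnet 4.5 | 9.6 | 0.4 | 8.6 | 8.3 | 9.8 | 9.3 | | 0.0 | 9.6 | 9.6 |
| GLM-4-7 | 10.0 | 0.4 | 8.8 | 9.8 | 9.8 | 9.3 | 9.3 | | 9.3 | 10.0 |
| DeepSeek V3.2 | 9.6 | 0.8 | 7.6 | 9.4 | 10.0 | 8.6 | 9.2 | 8.6 | | 9.6 |
| Grok 3 (Direct) | 9.6 | 1.9 | 8.6 | 8.8 | 9.6 | 9.1 | 9.6 | 6.5 | 9.0 | |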
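
The page does not say how the winner score is derived. As a sanity check, the sketch below averages the scores in the Claude Opus 4.5 column, skipping the empty self-judgment cell and the 0.0 from the first Gemini 3 judge, whose row is 0.0 for every respondent. Both exclusions are assumptions, as is reading each row's missing cell as the judge's own column; together they happen to reproduce the displayed 9.65. The names `OPUS_COLUMN` and `mean_excluding_blank_and_zero` are illustrative.

```rust
/// Scores given to Claude Opus 4.5, one entry per judge, in matrix row order.
/// `None` marks the empty self-judgment cell (assumed to sit on the diagonal).
const OPUS_COLUMN: [(&str, Option<f64>); 10] = [
    ("Grok Code Fast", Some(9.8)),
    ("Gemini 3", Some(0.0)),
    ("MiniMax M2", Some(9.6)),
    ("GPT-5.2-Codex", Some(8.8)),
    ("Claude Opus 4.5", None),
    ("Gemini 3", Some(9.8)),
    ("Claude Sonnet 4.5", Some(9.8)),
    ("GLM-4-7", Some(9.8)),
    ("DeepSeek V3.2", Some(10.0)),
    ("Grok 3 (Direct)", Some(9.6)),
];

/// Mean of the scores that are present and non-zero.
/// Treating 0.0 as "no valid judgment" is an assumption, not a documented rule.
fn mean_excluding_blank_and_zero(column: &[(&str, Option<f64>)]) -> Option<f64> {
    let kept: Vec<f64> = column
        .iter()
        .filter_map(|(_, score)| *score) // drop the blank cell
        .filter(|score| *score > 0.0)    // drop zero scores
        .collect();
    if kept.is_empty() {
        None
    } else {
        Some(kept.iter().sum::<f64>() / kept.len() as f64)
    }
}
```

Eight scores remain after the two exclusions; they sum to 77.2, and 77.2 / 8 = 9.65.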