Evaluations / EVAL-20260402-122714
code
Feb 10, 2026 · CODE-005

Convert this Python code to idiomatic Rust. The code must compile, handle errors properly, and follow Rust best practices.

```python
from dataclasses import dataclass
from typing import Optional, List
from datetime import datetime


@dataclass
class Task:
    id: int
    title: str
    completed: bool
    due_date: Optional[datetime]
    tags: List[str]


class TaskManager:
    def __init__(self):
        self.tasks = []
        self.next_id = 1

    def add_task(self, title: str, due_date: Optional[datetime] = None, tags: List[str] = None) -> Task:
        task = Task(
            id=self.next_id,
            title=title,
            completed=False,
            due_date=due_date,
            tags=tags or []
        )
        self.tasks.append(task)
        self.next_id += 1
        return task

    def complete_task(self, task_id: int) -> bool:
        for task in self.tasks:
            if task.id == task_id:
                task.completed = True
                return True
        return False

    def get_overdue(self) -> List[Task]:
        now = datetime.now()
        return [t for t in self.tasks if t.due_date and t.due_date < now and not t.completed]
```

Winner: Claude Opus 4.6 (openrouter)
Winner score: 8.94 (matrix avg: 7.78)
10×10 Judgment Matrix · 77 judgments
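| Judge ↓ / Respondent → | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro | Claude Sonnet 4.6 | Grok 4.20 | DeepSeek V4 | GPT-OSS-120B | Gemini 3 | MiniMax M2.5 | MiMo-V2-Flash |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-5.4 |  | 7.5 | 0.5 | 6.5 | 7.0 | 6.3 | 4.8 | 6.2 | 4.2 | 5.2 |
| Claude Opus 4.6 | 9.2 |  | 1.0 | 9.2 | 8.3 | 7.5 | 9.2 | 9.0 | 8.0 | 7.0 |
| Gemini 3.1 Pro | 9.8 | 9.4 |  | 9.4 | 8.3 | 9.0 | 8.8 | 9.2 | 6.5 | 6.2 |
| Claude Sonnet 4.6 | 8.8 | 9.3 | 1.0 |  | 8.8 | 8.0 | 8.8 | 8.8 | 9.2 | 8.0 |
| Grok 4.20 | 8.6 | 8.4 | 2.5 | · |  | 6.8 | · | 7.8 | 7.5 | 7.0 |
| DeepSeek V4 | · | 9.6 | · | 9.6 | 8.6 |  | 9.6 | 8.8 | 9.6 | 8.8 |
| GPT-OSS-120B | 8.8 | 8.8 | · | 8.6 | · | 7.8 |  | · | 6.9 | 7.8 |
| Gemini 3 | 9.8 | 9.8 | 9.8 | · | 9.8 | 9.6 | 9.8 |  | · | 9.8 |
| MiniMax M2.5 | 7.4 | · | 1.2 | 8.4 | 7.5 | 6.5 | · | 7.5 |  | 7.6 |
| MiMo-V2-Flash | 8.6 | 8.6 | 8.6 | 9.0 | 8.6 | 8.2 | · | 8.8 | · |  |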
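
The headline numbers can be roughly cross-checked against the matrix with a short script. This is a sketch under stated assumptions, not the site's actual scoring code: it treats the blank diagonal as excluded self-judgments and `·` as missing judgments, and it assumes each respondent's score is a plain unweighted mean of its column. The names `MODELS`, `MATRIX`, and `column_means` are illustrative.

```python
from statistics import mean

MODELS = [
    "GPT-5.4", "Claude Opus 4.6", "Gemini 3.1 Pro", "Claude Sonnet 4.6",
    "Grok 4.20", "DeepSeek V4", "GPT-OSS-120B", "Gemini 3",
    "MiniMax M2.5", "MiMo-V2-Flash",
]

# Rows are judges, columns are respondents, in the order of MODELS.
# N (None) marks a blank diagonal cell or a "·" cell (assumed: no judgment).
N = None
MATRIX = [
    [N,   7.5, 0.5, 6.5, 7.0, 6.3, 4.8, 6.2, 4.2, 5.2],
    [9.2, N,   1.0, 9.2, 8.3, 7.5, 9.2, 9.0, 8.0, 7.0],
    [9.8, 9.4, N,   9.4, 8.3, 9.0, 8.8, 9.2, 6.5, 6.2],
    [8.8, 9.3, 1.0, N,   8.8, 8.0, 8.8, 8.8, 9.2, 8.0],
    [8.6, 8.4, 2.5, N,   N,   6.8, N,   7.8, 7.5, 7.0],
    [N,   9.6, N,   9.6, 8.6, N,   9.6, 8.8, 9.6, 8.8],
    [8.8, 8.8, N,   8.6, N,   7.8, N,   N,   6.9, 7.8],
    [9.8, 9.8, 9.8, N,   9.8, 9.6, 9.8, N,   N,   9.8],
    [7.4, N,   1.2, 8.4, 7.5, 6.5, N,   7.5, N,   7.6],
    [8.6, 8.6, 8.6, 9.0, 8.6, 8.2, N,   8.8, N,   N],
]


def column_means(models, matrix):
    """Return {respondent: mean of the scores it received}, skipping empty cells."""
    result = {}
    for col, name in enumerate(models):
        scores = [row[col] for row in matrix if row[col] is not None]
        result[name] = mean(scores)
    return result


# Number of cells that actually hold a score.
judgment_count = sum(cell is not None for row in MATRIX for cell in row)

means = column_means(MODELS, MATRIX)
winner = max(means, key=means.get)

# Assumed definition of "matrix avg": the mean of the per-respondent means.
matrix_avg = mean(means.values())

if __name__ == "__main__":
    print(f"judgments: {judgment_count}")
    print(f"winner: {winner} ({means[winner]:.3f})")
    print(f"mean of respondent means: {matrix_avg:.2f}")
```

On the displayed one-decimal values this counts 77 judgments and ranks Claude Opus 4.6 first with a column mean of 8.925, slightly below the reported 8.94. The gap is consistent with the page averaging unrounded scores, though that is a guess. The mean of the per-respondent means comes out at about 7.78, matching the reported matrix average.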