Evaluation EVAL-20260207-142101
code
Feb 24, 2026 · CODE-007

Explain what this code does in plain English. Then identify any bugs or design issues.

```python
def f(x, n=3, m=None):
    m = m or {}
    if n == 0:
        return [[]]
    if x in m:
        return m[x]
    r = []
    for i in range(len(x)):
        for p in f(x[:i] + x[i+1:], n-1, m):
            r.append([x[i]] + p)
    m[x] = r
    return r

def g(s, k):
    from collections import Counter
    c = Counter(s)
    h = []
    import heapq
    for ch, cnt in c.items():
        heapq.heappush(h, (-cnt, ch))
    r = []
    while h and len(r) < k:
        cnt, ch = heapq.heappop(h)
        r.append(ch)
    return ''.join(r)
```

Winner: GLM-4-7 (Zhipu)
Winner score: 9.45 (matrix avg: 8.34)
10×10 Judgment Matrix · 100 judgments
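| Judge ↓ / Respondent → | Grok Code Fast | Gemini 3 | GPT-5.2-Codex | DeepSeek V3.2 | Claude Opus 4.5 | Gemini 3 | Claude Sonnet 4.5 | MiniMax M2 | GLM-4-7 | Grok 3 (Direct) |
|---|---|---|---|---|---|---|---|---|---|---|
| Grok Code Fast | - | 9.2 | 9.2 | 9.6 | 9.4 | 9.8 | 8.9 | 2.0 | 9.8 | 9.4 |
| Gemini 3 | 0.0 | - | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| GPT-5.2-Codex | 8.0 | 7.0 | - | 7.8 | 7.8 | 6.2 | 5.3 | 0.0 | 0.0 | 5.5 |
| DeepSeek V3.2 | 9.1 | 8.8 | 9.6 | - | 8.7 | 9.6 | 9.2 | 7.5 | 9.6 | 9.2 |
| Claude Opus 4.5 | 9.0 | 8.8 | 9.2 | 9.2 | - | 9.2 | 7.8 | 1.4 | 9.2 | 7.2 |
| Gemini 3 | 9.6 | 9.6 | 9.8 | 9.6 | 10.0 | - | 8.6 | 0.0 | 9.8 | 8.3 |
| Claude Sonnet 4.5 | 9.3 | 9.3 | 9.3 | 9.0 | 8.8 | 9.3 | - | 5.5 | 9.3 | 9.3 |
| MiniMax M2 | 9.2 | 6.2 | 8.6 | 8.8 | 8.6 | 9.0 | 8.6 | - | 10.0 | 7.3 |
| GLM-4-7 | 0.0 | 9.3 | 9.8 | 0.0 | 0.0 | 9.2 | 0.0 | 0.0 | - | 0.0 |
| Grok 3 (Direct) | 8.7 | 8.4 | 8.3 | 8.7 | 8.4 | 8.8 | 8.4 | 5.9 | 8.4 | - |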
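
The page does not say how the winner score is derived from the matrix. As a rough sketch only, the snippet below averages the scores the nine other judges gave GLM-4-7. It rests on two assumptions that are not confirmed by the source: each judge's row omits its own self-judgment cell, and a 0.0 marks a missing or failed judgment rather than a real score. The names `glm_column` and `respondent_score` are hypothetical.

```python
from statistics import mean

# Scores received by the respondent GLM-4-7, one per judge, read off the matrix.
# Assumption: the self-judgment cell is absent, so there are nine entries.
glm_column = {
    "Grok Code Fast": 9.8,
    "Gemini 3 (first row)": 0.0,
    "GPT-5.2-Codex": 0.0,
    "DeepSeek V3.2": 9.6,
    "Claude Opus 4.5": 9.2,
    "Gemini 3 (second row)": 9.8,
    "Claude Sonnet 4.5": 9.3,
    "MiniMax M2": 10.0,
    "Grok 3 (Direct)": 8.4,
}


def respondent_score(column):
    """Mean of the non-zero scores in a column.

    Treating 0.0 as "no judgment" is an assumption, not documented behavior.
    Returns None when no valid score remains.
    """
    valid = [score for score in column.values() if score > 0.0]
    return mean(valid) if valid else None


score = respondent_score(glm_column)
print(round(score, 2))  # 9.44
```

Under these assumptions the result is about 9.44, close to but not identical to the displayed 9.45; the gap may come from rounding in the displayed cells or from a different aggregation rule.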