edge cases
Feb 07, 2026 · EDGE-004

Process these strings and describe any issues:

1. "HelloWorld" (contains zero-width space)
2. "naïve" vs "naïve" (different Unicode normalizations)
3. "🇺🇸" (flag emoji - actually two code points)
4. "olleh" (contains right-to-left override)
5. "a]o[r6}s{4(u2)1*v+ni" (looks normal but check character codes)
6. "<script>alert('xss')</script>"

For each: What might go wrong if this string is used as (a) a filename, (b) a database key, (c) displayed in HTML?
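The checks this prompt asks for can be sketched with Python's standard library. This is a minimal illustration, not a reference answer: the helper names `describe` and `hidden_format_chars` are hypothetical, and the sample strings are rebuilt from explicit escapes (an assumption) because invisible characters may not survive copy and paste.

```python
import html
import unicodedata


def describe(s: str) -> list[str]:
    """Return one 'U+XXXX NAME (category)' line per code point in s."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')} "
        f"({unicodedata.category(ch)})"
        for ch in s
    ]


def hidden_format_chars(s: str) -> list[str]:
    """List code points in category Cf (invisible format controls)."""
    return [f"U+{ord(ch):04X}" for ch in s if unicodedata.category(ch) == "Cf"]


# Illustrative inputs built from escapes (assumed, not copied from the prompt).
zwsp = "Hello\u200bWorld"                       # zero-width space inside
nfc = unicodedata.normalize("NFC", "nai\u0308ve")  # precomposed i-diaeresis
nfd = unicodedata.normalize("NFD", nfc)            # i + combining diaeresis
flag = "\U0001F1FA\U0001F1F8"                   # two regional indicators
rlo = "\u202eolleh"                             # right-to-left override first
xss = "<script>alert('xss')</script>"

print(hidden_format_chars(zwsp))   # ['U+200B']
print(nfc == nfd, len(nfc), len(nfd))  # False 5 6
print(len(flag))                   # 2
print(hidden_format_chars(rlo))    # ['U+202E']
print(html.escape(xss))            # markup characters become entities
print("\n".join(describe(flag)))   # names of both code points
```

Two strings that render identically but compare unequal, as `nfc` and `nfd` do, are what make unnormalized text risky as a database key or filename. Normalizing to one form before storage and escaping on output are the usual mitigations.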
Winner: MiMo-V2-Flash (Xiaomi) · Winner score: 9.44 · Matrix avg: 8.69
10×10 Judgment Matrix · 100 judgments
| Judge ↓ / Respondent → | Claude Opus 4.5 | Gemini 3 | Claude Sonnet 4.5 | GPT-5.2-Codex | GPT-OSS-120B | Gemini 3 | DeepSeek V3.2 | MiMo-V2-Flash | Grok 4.1 Fast | Grok 3 (Direct) |
|---|---|---|---|---|---|---|---|---|---|---|
| Claude Opus 4.5 | — | 5.3 | 9.3 | 7.8 | 8.6 | 9.6 | 9.0 | 9.6 | 9.2 | 9.2 |
| Gemini 3 | 0.0 | — | 0.0 | 6.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Claude Sonnet 4.5 | 10.0 | 8.1 | — | 8.4 | 8.8 | 9.6 | 9.6 | 10.0 | 9.6 | 9.3 |
| GPT-5.2-Codex | 8.0 | 3.1 | 6.3 | — | 6.0 | 8.2 | 8.8 | 8.2 | 8.6 | 8.4 |
| GPT-OSS-120B | 7.8 | 4.5 | 0.0 | 0.0 | — | 0.0 | 8.8 | 0.0 | 8.8 | 9.2 |
| Gemini 3 | 9.8 | 7.1 | 9.6 | 8.1 | 9.0 | — | 9.8 | 9.8 | 9.8 | 9.2 |
| DeepSeek V3.2 | 10.0 | 8.4 | 9.3 | 6.0 | 8.8 | 10.0 | — | 9.3 | 9.3 | 9.8 |
| MiMo-V2-Flash | 9.6 | 7.8 | 8.8 | 7.8 | 9.0 | 9.0 | 9.0 | — | 8.6 | 9.0 |
| Grok 4.1 Fast | 10.0 | 7.2 | 9.8 | 7.8 | 8.8 | 10.0 | 9.6 | 10.0 | — | 9.6 |
| Grok 3 (Direct) | 9.3 | 7.3 | 8.8 | 8.0 | 8.8 | 8.8 | 8.8 | 9.2 | 8.8 | — |
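The page does not say how the winner score is derived from the matrix. One reading that reproduces the reported 9.44 is to average the MiMo-V2-Flash column while skipping 0.0 entries, on the assumption that a 0.0 marks a missing or failed judgment and not a real score. The sketch below checks that arithmetic. The function name `mean_nonzero` is hypothetical.

```python
# Scores given to MiMo-V2-Flash, read down its column in the matrix
# (the self-judgment cell is omitted).
mimo_column = [9.6, 0.0, 10.0, 8.2, 0.0, 9.8, 9.3, 10.0, 9.2]


def mean_nonzero(scores: list[float]) -> float:
    """Average the scores, skipping 0.0 entries.

    Assumption: 0.0 marks a missing or failed judgment, not a real score.
    """
    valid = [s for s in scores if s > 0]
    return sum(valid) / len(valid) if valid else float("nan")


winner_score = round(mean_nonzero(mimo_column), 2)
plain_mean = round(sum(mimo_column) / len(mimo_column), 2)

print(winner_score)  # 9.44, matches the reported winner score
print(plain_mean)    # 7.34, what counting the zeros as real scores would give
```

Because the plain mean lands at 7.34, the zeros appear to be excluded from the published score, although this remains an inference from the numbers and not something the page documents.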