Evaluation EVAL-20260207-140157
code
Jan 27, 2026 · CODE-003

Review this Flask API endpoint for security vulnerabilities. Identify ALL security issues and explain the fix for each.

```python
from flask import Flask, request, jsonify
import sqlite3
import pickle
import os

app = Flask(__name__)

@app.route('/api/user/<user_id>')
def get_user(user_id):
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    query = f"SELECT * FROM users WHERE id = {user_id}"
    cursor.execute(query)
    user = cursor.fetchone()
    return jsonify({"user": user})

@app.route('/api/upload', methods=['POST'])
def upload_file():
    file = request.files['file']
    filename = file.filename
    file.save(os.path.join('/uploads', filename))
    return jsonify({"status": "uploaded", "path": f"/uploads/{filename}"})

@app.route('/api/settings', methods=['POST'])
def update_settings():
    data = pickle.loads(request.data)
    # Process settings...
    return jsonify({"status": "updated"})

@app.route('/api/redirect')
def redirect_user():
    url = request.args.get('url')
    return f'<meta http-equiv="refresh" content="0;url={url}">'
```
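This page does not include an answer key. As a rough illustration of the kind of fixes the prompt is asking for, here is a minimal, standard-library-only sketch of one safer pattern per endpoint: a bound SQL parameter, a sanitized upload filename, JSON instead of pickle, and an allowlisted redirect target. The helper names, the `ALLOWED_REDIRECT_HOSTS` set, and the `example.com` host are hypothetical and do not come from the evaluated responses.

```python
import json
import os
import sqlite3
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would list its own hosts.
ALLOWED_REDIRECT_HOSTS = {"example.com"}


def fetch_user(conn: sqlite3.Connection, user_id: str):
    """Look up a user with a bound parameter instead of an f-string."""
    cursor = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,))
    return cursor.fetchone()


def safe_upload_path(upload_dir: str, filename: str) -> str:
    """Strip directory components so the file cannot leave upload_dir."""
    name = os.path.basename(filename.replace("\\", "/"))
    if not name or name in {".", ".."}:
        raise ValueError("invalid filename")
    return os.path.join(upload_dir, name)


def parse_settings(raw: bytes) -> dict:
    """Parse settings as JSON; unlike pickle, parsing cannot run code."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("settings must be a JSON object")
    return data


def is_allowed_redirect(url: str) -> bool:
    """Accept only http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    return (
        parsed.scheme in {"http", "https"}
        and parsed.hostname in ALLOWED_REDIRECT_HOSTS
    )
```

In a Flask app these helpers would be called from the route handlers. Authentication, upload size and type limits, and error handling are left out of this sketch.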

Winner: GPT-5.2-Codex (OpenAI)
Winner score: 9.77
Matrix average: 8.74
10×10 Judgment Matrix · 100 judgments
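| Judge ↓ / Respondent → | GPT-5.2-Codex | Grok Code Fast | Gemini 3 | Claude Opus 4.5 | Claude Sonnet 4.5 | Gemini 3 | MiniMax M2 | GLM-4-7 | DeepSeek V3.2 | Grok 3 (Direct) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.2-Codex | - | 8.8 | 8.6 | 8.6 | 8.8 | 5.9 | 2.5 | 0.0 | 8.3 | 6.5 |
| Grok Code Fast | 9.8 | - | 10.0 | 10.0 | 9.8 | 8.0 | 6.4 | 1.6 | 10.0 | 10.0 |
| Gemini 3 | 9.8 | 9.8 | - | 10.0 | 9.8 | 9.3 | 8.3 | 0.0 | 9.8 | 9.6 |
| Claude Opus 4.5 | 9.8 | 10.0 | 9.6 | - | 9.6 | 7.9 | 5.1 | 0.5 | 9.8 | 8.8 |
| Claude Sonnet 4.5 | 9.8 | 10.0 | 9.8 | 10.0 | - | 9.2 | 7.8 | 9.8 | 9.8 | 10.0 |
| Gemini 3 | 0.0 | 0.0 | 10.0 | 0.0 | 0.0 | - | 2.4 | 0.0 | 0.0 | 0.0 |
| MiniMax M2 | 10.0 | 10.0 | 10.0 | 0.0 | 10.0 | 7.2 | - | 9.6 | 9.8 | 8.7 |
| GLM-4-7 | 9.6 | 9.8 | 10.0 | 10.0 | 10.0 | 6.8 | 3.3 | - | 10.0 | 8.8 |
| DeepSeek V3.2 | 9.8 | 9.6 | 9.8 | 10.0 | 10.0 | 9.6 | 8.6 | 9.2 | - | 9.8 |
| Grok 3 (Direct) | 9.6 | 9.6 | 9.4 | 9.6 | 9.4 | 8.6 | 6.8 | 8.0 | 9.6 | - |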
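The page does not say how the 9.77 winner score is derived from the matrix. As a rough check, the sketch below averages the scores GPT-5.2-Codex received from the other nine judges (the first value in each of their rows). It assumes that a 0.0 marks a failed judgment and is excluded; under that assumption the mean is about 9.775, close to the reported 9.77. The function name `respondent_score` and the zero-skipping rule are hypothetical, not the site's documented method.

```python
from statistics import mean

# Scores given to GPT-5.2-Codex by each other judge, copied from the matrix.
# "Gemini 3" appears twice because the page lists two judges under that name.
codex_column = [
    ("Grok Code Fast", 9.8),
    ("Gemini 3", 9.8),
    ("Claude Opus 4.5", 9.8),
    ("Claude Sonnet 4.5", 9.8),
    ("Gemini 3", 0.0),
    ("MiniMax M2", 10.0),
    ("GLM-4-7", 9.6),
    ("DeepSeek V3.2", 9.8),
    ("Grok 3 (Direct)", 9.6),
]


def respondent_score(scores, skip_zero=True):
    """Mean score for one respondent; optionally drop 0.0 judgments.

    Treating 0.0 as a failed judgment is an assumption for illustration.
    """
    kept = [s for _, s in scores if not (skip_zero and s == 0.0)]
    return mean(kept)


if __name__ == "__main__":
    print(f"excluding zeros: {respondent_score(codex_column):.3f}")
    print(f"including zeros: {respondent_score(codex_column, skip_zero=False):.3f}")
```

Including the single 0.0 from the second Gemini 3 judge pulls the mean down to roughly 8.69, so the treatment of zero scores matters a great deal for the ranking.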