The Multivac — Ask any model, routed by evaluation

Evaluation EVAL-20260402-193645
analysis
Apr 02, 2026 · ANALYSIS-014
You're launching an AI API. Competitors charge $0.01-0.03/1K tokens. Your model is 20% better on benchmarks but 40% more expensive to run. (1) Should you price above, at, or below competitors? Analyze each strategy. (2) Design a pricing structure (free tier, usage-based, enterprise). (3) A customer processes 10M tokens/month. Calculate their cost under your pricing vs competitors. (4) At what volume does a dedicated instance become cheaper than per-token pricing?
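The arithmetic behind parts (3) and (4) of the prompt can be sketched as follows. Only the competitor range ($0.01-0.03 per 1K tokens) and the 10M tokens/month volume come from the prompt; `OUR_PRICE_PER_1K` and `DEDICATED_FEE` are hypothetical placeholders chosen for illustration, not values from any model's answer. At the prompt's figures, the competitor cost works out to $100 to $300 per month.

```python
# Minimal sketch of the per-token cost and break-even arithmetic in the prompt.
# OUR_PRICE_PER_1K and DEDICATED_FEE are hypothetical values, not from the source.

TOKENS_PER_MONTH = 10_000_000          # from the prompt
COMPETITOR_RANGE = (0.01, 0.03)        # USD per 1K tokens, from the prompt
OUR_PRICE_PER_1K = 0.02                # hypothetical: a price inside the competitor range
DEDICATED_FEE = 5_000.0                # hypothetical: flat monthly fee for a dedicated instance


def monthly_cost(tokens: int, price_per_1k: float) -> float:
    """Usage-based cost in USD for a month's token volume."""
    return tokens / 1000 * price_per_1k


def break_even_tokens(dedicated_fee: float, price_per_1k: float) -> float:
    """Monthly token volume above which a flat dedicated fee beats per-token pricing."""
    return dedicated_fee / price_per_1k * 1000


competitor_low = monthly_cost(TOKENS_PER_MONTH, COMPETITOR_RANGE[0])
competitor_high = monthly_cost(TOKENS_PER_MONTH, COMPETITOR_RANGE[1])
our_cost = monthly_cost(TOKENS_PER_MONTH, OUR_PRICE_PER_1K)
threshold = break_even_tokens(DEDICATED_FEE, OUR_PRICE_PER_1K)

print(f"Competitors: ${competitor_low:,.2f} to ${competitor_high:,.2f} per month")
print(f"Hypothetical price: ${our_cost:,.2f} per month")
print(f"Dedicated instance breaks even near {threshold:,.0f} tokens/month")
```

The break-even point is simply the flat fee divided by the per-token price, so it scales linearly with whatever dedicated fee is actually charged.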
Winner: MiniMax M2.5 (openrouter)
Winner score: 8.72
Matrix avg: 7.88
10×10 Judgment Matrix · 89 judgments
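| Judge ↓ / Respondent → | Gemini 3.1 Pro | Claude Opus 4.6 | GPT-5.4 | DeepSeek V4 | MiMo-V2-Flash | Claude Sonnet 4.6 | Grok 4.20 | Gemini 3 | GPT-OSS-120B | MiniMax M2.5 |
|---|---|---|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro | — | 6.4 | 7.9 | 7.0 | 8.8 | 7.5 | 6.5 | 7.2 | · | 8.9 |
| Claude Opus 4.6 | 6.0 | — | 7.5 | 6.3 | 7.8 | 8.2 | 8.8 | 7.8 | 7.0 | 7.8 |
| GPT-5.4 | 3.3 | 6.0 | — | 6.7 | 7.2 | 5.8 | 8.4 | 8.0 | 5.3 | 8.4 |
| DeepSeek V4 | 8.6 | 9.4 | 8.8 | — | 9.0 | 9.2 | 9.3 | 9.0 | 9.0 | 9.0 |
| MiMo-V2-Flash | 7.3 | 7.8 | 8.0 | 8.2 | — | 9.3 | 8.6 | 8.6 | 7.8 | 9.2 |
| Claude Sonnet 4.6 | 6.3 | 8.8 | 8.2 | 7.3 | 8.6 | — | 8.6 | 7.8 | 7.3 | 8.6 |
| Grok 4.20 | 6.5 | 8.7 | 8.8 | 7.0 | 8.4 | 8.4 | — | 7.8 | 8.6 | 8.8 |
| Gemini 3 | 7.3 | 9.6 | 9.6 | 9.6 | 9.6 | 9.6 | 9.8 | — | 8.8 | 9.6 |
| GPT-OSS-120B | 4.3 | 5.9 | 7.7 | 7.6 | 8.6 | 6.5 | 7.8 | 7.8 | — | 8.3 |
| MiniMax M2.5 | 5.8 | 6.8 | 7.6 | 8.6 | 8.8 | 7.4 | 8.2 | 8.6 | 6.1 | — |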
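One plausible reading of the summary figures is that each respondent's score is the mean of its column (every judge except itself) and the matrix average is the mean of all recorded judgments. The page does not state its aggregation method, so the sketch below is an assumption, as are the names `MODELS`, `MATRIX`, and `respondent_means`. It treats the diagonal and the single `·` cell as missing, which gives the 89 judgments mentioned above. Because it works from the rounded values shown in the matrix, its results land close to, but not exactly on, the reported 8.72 and 7.88.

```python
from statistics import mean

# Respondent order matches the matrix columns; rows are judges in the same order.
MODELS = [
    "Gemini 3.1 Pro", "Claude Opus 4.6", "GPT-5.4", "DeepSeek V4", "MiMo-V2-Flash",
    "Claude Sonnet 4.6", "Grok 4.20", "Gemini 3", "GPT-OSS-120B", "MiniMax M2.5",
]

# None marks the diagonal (no self-judgment) and the one missing judgment.
MATRIX = [
    [None, 6.4, 7.9, 7.0, 8.8, 7.5, 6.5, 7.2, None, 8.9],
    [6.0, None, 7.5, 6.3, 7.8, 8.2, 8.8, 7.8, 7.0, 7.8],
    [3.3, 6.0, None, 6.7, 7.2, 5.8, 8.4, 8.0, 5.3, 8.4],
    [8.6, 9.4, 8.8, None, 9.0, 9.2, 9.3, 9.0, 9.0, 9.0],
    [7.3, 7.8, 8.0, 8.2, None, 9.3, 8.6, 8.6, 7.8, 9.2],
    [6.3, 8.8, 8.2, 7.3, 8.6, None, 8.6, 7.8, 7.3, 8.6],
    [6.5, 8.7, 8.8, 7.0, 8.4, 8.4, None, 7.8, 8.6, 8.8],
    [7.3, 9.6, 9.6, 9.6, 9.6, 9.6, 9.8, None, 8.8, 9.6],
    [4.3, 5.9, 7.7, 7.6, 8.6, 6.5, 7.8, 7.8, None, 8.3],
    [5.8, 6.8, 7.6, 8.6, 8.8, 7.4, 8.2, 8.6, 6.1, None],
]


def respondent_means(matrix, models):
    """Mean score each respondent received, skipping missing cells."""
    result = {}
    for col, name in enumerate(models):
        received = [row[col] for row in matrix if row[col] is not None]
        result[name] = mean(received)
    return result


scores = respondent_means(MATRIX, MODELS)
winner = max(scores, key=scores.get)
all_judgments = [s for row in MATRIX for s in row if s is not None]
overall = mean(all_judgments)

print(f"Judgments counted: {len(all_judgments)}")
print(f"Top respondent: {winner} ({scores[winner]:.2f})")
print(f"Mean of all judgments: {overall:.2f}")
```

Reading across rows instead of down columns gives each judge's average leniency. In this matrix, for example, Gemini 3 gave higher scores than GPT-5.4 in every column the two both scored.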