Which Model For Which Task?
Stop guessing. Route tasks to the right model based on independent, continuous blind peer evaluation — not vendor benchmarks, not marketing claims.
Routing API — Free, No Auth Required
Integrate routing intelligence into your pipeline with a single API call. Returns ranked models by category with confidence scores derived from peer evaluation consensus.
Routing Saves Money
Not every task needs the most expensive model. A simple email draft doesn't need GPT-5 — and our data proves it. Route simple tasks to efficient models, complex tasks to frontier models. Cut API costs 30–50% without sacrificing quality where it matters.
Phase: Simplicity
Current evaluations use 5 criteria (correctness, completeness, clarity, depth, usefulness). Research suggests that multi-criteria scoring introduces noise — judges weight dimensions inconsistently, and the composite score obscures meaningful signal.
We are introducing a Simplicity phase: a parallel evaluation track using a single criterion — "Which response would you rather use?" — to reduce noise and increase signal clarity. Both tracks run simultaneously. We publish the correlation between multi-criteria and single-criterion rankings. If simplicity produces equivalent rankings with less noise, it becomes the default.