Xyntherium

Kimi

Benched

Sustained ~90% production failure rate (timeouts / empty bodies). Benched since 2026-06-10, pending re-evaluation.

Last updated: gathering data

Routed for

General reasoning · Documents & images · Live URL fetch · Deep research

Kimi is routed to questions that play to these strengths. Where a task needs a capability it doesn’t have, the question goes to the models that do, and Kimi sits that one out.

This scorecard is gathering data. The first numbers appear after the next daily refresh of live verification traffic.

Rebuilt daily from live verification traffic. Capability flags are set by hand; the performance numbers are earned.