Xyntherium
Kimi
Benched: Sustained ~90% production failure rate (timeouts / empty bodies). Benched pending re-evaluation. · since 2026-06-10
Last updated: gathering data
Routed for
✓ General reasoning · Documents & images · Live URL fetch · Deep research
Kimi is routed to questions that play to these strengths. Where a task needs a capability it doesn’t have, the question goes to the models that do, and Kimi sits that one out.
This scorecard is gathering data. The first numbers appear after the next daily refresh of live verification traffic.
Rebuilt daily from live verification traffic. Capability flags are set by hand; the performance numbers are earned.
