Xyntherium

Grok

Active

Last updated: Jun 10, 2026

Routed for

General reasoning · Documents & images · Live URL fetch · Deep research

Grok is routed to questions that play to these strengths. Where a task needs a capability it doesn’t have, the question goes to the models that do, and Grok sits that one out.
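The capability check described above can be sketched as a simple set filter. This is a minimal illustration, not the router's actual code: the `Model` class, the `eligible_models` function, the flag spellings, and the second model with its `code_execution` flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capabilities: frozenset  # hand-set capability flags (names assumed)


def eligible_models(models, required):
    """Keep only the models whose flags cover every required capability."""
    required = frozenset(required)
    return [m for m in models if required <= m.capabilities]


models = [
    Model("Grok", frozenset({
        "general_reasoning", "documents_images",
        "live_url_fetch", "deep_research",
    })),
    # Hypothetical second model, included only to show a routing split.
    Model("other-model", frozenset({"general_reasoning", "code_execution"})),
]

# A task that needs live URL fetch goes to Grok only.
url_task = eligible_models(models, {"live_url_fetch"})

# A task that needs a flag Grok lacks skips Grok entirely.
code_task = eligible_models(models, {"code_execution"})
```

A subset test keeps the rule easy to audit: a model is eligible only when every required flag is present.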

Across all tasks

Metric                  7d      30d     All-time
Response rate           100%    100%    100%
p50 latency             8.4s    8.4s    8.4s
p95 latency             8.7s    8.7s    8.7s
Avg cost / query
Agreement w/ verdict    64%     64%     64%
Consensus flip rate     0%      0%      0%

Routed on 2 of the last 30 days’ queries it was eligible for, answering 2.

By task type · 30-day

Score = router weight
Task       Responded    p95     Agreement    Flip    Score
General    100%         8.7s    64%          0%

The score blends agreement with the verified verdict, response rate, and speed over the last 30 days. When a task has more capable models than a panel needs, the router prefers the higher scores — a soft preference, never a hard exclusion.
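One way to picture the blend and the soft preference is the sketch below. The page does not publish the formula, so the weights, the latency budget, the speed normalization, and the names `Stats`, `score`, and `pick_panel` are all assumptions made for illustration. The two extra models and their numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stats:
    name: str
    agreement: float       # share of answers matching the verified verdict, 0..1
    response_rate: float   # share of routed queries answered, 0..1
    p95_latency_s: float   # 95th-percentile latency in seconds


# Assumed weights and latency budget; the real values are not published.
W_AGREEMENT, W_RESPONSE, W_SPEED = 0.6, 0.2, 0.2
LATENCY_BUDGET_S = 30.0


def score(s):
    """Blend agreement, response rate, and speed into one router weight."""
    speed = max(0.0, 1.0 - s.p95_latency_s / LATENCY_BUDGET_S)
    return W_AGREEMENT * s.agreement + W_RESPONSE * s.response_rate + W_SPEED * speed


def pick_panel(candidates, panel_size):
    """Prefer higher scores only when there are more candidates than seats."""
    if len(candidates) <= panel_size:
        return list(candidates)  # no surplus, so no score is used to exclude
    return sorted(candidates, key=score, reverse=True)[:panel_size]


grok = Stats("Grok", agreement=0.64, response_rate=1.0, p95_latency_s=8.7)
# Hypothetical competitors, included only to exercise the ranking.
model_b = Stats("model-b", agreement=0.80, response_rate=1.0, p95_latency_s=12.0)
model_c = Stats("model-c", agreement=0.50, response_rate=0.9, p95_latency_s=5.0)

panel = pick_panel([grok, model_b, model_c], panel_size=2)
```

In this sketch a low score never bars a model outright. It only lowers the model's rank when a task has more capable models than the panel has seats.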

Rebuilt daily from live verification traffic. Capability flags are set by hand; the performance numbers are earned.