xAI (Grok)

xAI Grok through Zumik - context caching at a 75% read discount, live web-search grounding, Grok-3 and the cost-optimized Grok-3 Mini, no Batch tier, and when the broker routes here.

xAI Grok is the provider to reach for when a request needs live web-search grounding or a fast, cost-optimized frontier response. Grok-3 covers frontier reasoning and live-web-grounded tasks; Grok-3 Mini is the cost-sensitive routing target for high-volume work where full Grok-3 capability is not required. It is the only first-class provider here that reports live_search_supported.

It is available on both the managed-provider and BYOK profiles. Requests that resolve here report Agent-Resolved-Provider: xai.
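As a rough illustration, a client could confirm where a request landed by inspecting that value. The sketch below assumes Agent-Resolved-Provider arrives as an HTTP response header and that the headers are available as a plain mapping; the helper names are hypothetical and not part of any Zumik SDK.

```python
from typing import Mapping, Optional

# Header name and value as given above; treating it as an HTTP response
# header is an assumption.
RESOLVED_PROVIDER_HEADER = "Agent-Resolved-Provider"
XAI_PROVIDER = "xai"


def resolved_provider(headers: Mapping[str, str]) -> Optional[str]:
    """Return the provider named in the response headers, or None if absent."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    wanted = RESOLVED_PROVIDER_HEADER.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip().lower()
    return None


def resolved_to_xai(headers: Mapping[str, str]) -> bool:
    """True when the request resolved to xAI."""
    return resolved_provider(headers) == XAI_PROVIDER
```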

Caching economics

xAI caches a stable context across consecutive requests and bills cached reads at a 75% discount. Monitor the cache hit rate via the usage metadata, and keep the stable prefix identical across calls so the cached context stays warm.

Fact	Value
Cache type	Explicit (cached context)
Minimum cacheable prefix	1,024 tokens
Cache-read discount	75%
Default TTL	300 seconds (5 min)
Extended TTL	3,600 seconds (1h)
Cached-token reporting	Yes
Manual cache clear	Not supported
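The arithmetic behind the discount can be sketched as follows. The minimum prefix and discount come from the table above; the function names and the per-million-token price used in the usage note are hypothetical placeholders, not published xAI or Zumik rates.

```python
MIN_CACHEABLE_PREFIX_TOKENS = 1_024  # from the table above
CACHE_READ_DISCOUNT = 0.75           # from the table above


def is_cacheable(prefix_tokens: int) -> bool:
    """A stable prefix must reach the minimum length to be cached."""
    return prefix_tokens >= MIN_CACHEABLE_PREFIX_TOKENS


def cache_hit_rate(prompt_tokens: int, cached_tokens: int) -> float:
    """Fraction of prompt tokens that were served from cache."""
    return cached_tokens / prompt_tokens if prompt_tokens else 0.0


def input_cost(prompt_tokens: int, cached_tokens: int, price_per_mtok: float) -> float:
    """Input cost when cached tokens are billed at the discounted rate.

    price_per_mtok is a caller-supplied price per million input tokens
    (hypothetical; substitute the real rate).
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    # Cached tokens cost 25% of the normal rate (a 75% discount).
    billable = uncached + cached_tokens * (1 - CACHE_READ_DISCOUNT)
    return billable * price_per_mtok / 1_000_000
```

For a 10,000-token prompt with 8,000 tokens read from cache, the billable volume is equivalent to 4,000 full-price tokens, a 60% reduction in input cost regardless of the price used.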

No Batch tier

xAI has no Batch API (batch_api_supported is false). Route background and batch-class work to a provider that does have one - OpenAI, Anthropic, or Gemini - and keep xAI for the interactive and live-grounded traffic it is best at. xAI exposes a single standard service tier.

Live search

Grok supports real-time web-search grounding, which lets a request reach for up-to-date information without a separate retrieval pipeline. The capability is advertised through the live_search_supported manifest flag, so the broker routes grounding-dependent work only to providers that report it, which among first-class providers means xAI.
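The two manifest flags described above can be read as a simple eligibility filter. The sketch below is illustrative only: the Manifest class, the eligible function, and the lowercase provider identifiers other than xai are assumptions, and the flag values merely restate what this page says about each provider.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Manifest:
    """Hypothetical, simplified view of a provider capability manifest."""
    provider: str
    live_search_supported: bool
    batch_api_supported: bool


# Flag values restate this page: xAI is the only first-class provider with
# live search, and OpenAI, Anthropic, and Gemini are the ones with a Batch API.
MANIFESTS = [
    Manifest("xai", live_search_supported=True, batch_api_supported=False),
    Manifest("openai", live_search_supported=False, batch_api_supported=True),
    Manifest("anthropic", live_search_supported=False, batch_api_supported=True),
    Manifest("gemini", live_search_supported=False, batch_api_supported=True),
]


def eligible(
    manifests: Sequence[Manifest],
    *,
    needs_live_search: bool = False,
    needs_batch: bool = False,
) -> List[str]:
    """Return providers whose manifest satisfies every stated requirement."""
    return [
        m.provider
        for m in manifests
        if (m.live_search_supported or not needs_live_search)
        and (m.batch_api_supported or not needs_batch)
    ]
```

Under these assumptions, a grounding-dependent request leaves only xai eligible, a batch-class request leaves the other three, and a request that demands both has no eligible provider.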

When the broker routes here

Live-grounded reasoning

Tasks that need current web information inline and would otherwise require a separate Perplexity-style grounding step.

Cost-sensitive frontier

Grok-3 Mini for high-volume interactive work where a frontier model is needed but cost per token matters more than maximum capability.

Fast lightweight responses

Grok-3 Mini is competitive with commodity small frontier models on time-to-first-token.

Warm stable prefixes

Long stable contexts reused across consecutive calls take the 75% cached-context discount.
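One way to picture how these cases map onto the two models is the sketch below. It is not the broker's actual logic: the Request fields, the model identifiers grok-3 and grok-3-mini, and the priority order (live grounding before cost or latency) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical request traits a broker might consider."""
    needs_live_search: bool = False
    background: bool = False
    cost_sensitive: bool = False
    latency_sensitive: bool = False


def pick_xai_model(req: Request) -> Optional[str]:
    """Pick an xAI model for the request, or None to route elsewhere."""
    if req.background:
        # xAI has no Batch tier, so background work goes to another provider.
        return None
    if req.needs_live_search:
        # Assumed priority: live-grounded tasks go to Grok-3.
        return "grok-3"
    if req.cost_sensitive or req.latency_sensitive:
        # High-volume or latency-bound interactive work goes to Grok-3 Mini.
        return "grok-3-mini"
    # Remaining interactive frontier reasoning defaults to Grok-3.
    return "grok-3"
```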

At a glance

Capability	Value
Context window	131,072 tokens
Multimodal input	Yes
Live search	Yes
Batch API	No
Dedicated deployment	No
Service tiers	standard
Data retention	standard
Regions	us

Manifest revision cap_2026_06_09. The capability manifest is what tells the broker xAI is the only first-class option for live-grounded work and that background work must route elsewhere.