xAI (Grok)

xAI Grok through Zumik: context caching at a 75% read discount, live web-search grounding, Grok-3 and the cost-optimized Grok-3 Mini, no Batch tier, and when the broker routes here.

xAI Grok is the provider to reach for when a request needs live web-search grounding or a fast, cost-optimized frontier response. Grok-3 covers frontier reasoning and live-web-grounded tasks; Grok-3 Mini is the cost-sensitive routing target for high-volume work that does not need full Grok-3 capability. xAI is the only first-class provider here that reports live_search_supported.

xAI is available on both the managed-provider and BYOK profiles. Requests that resolve here report Agent-Resolved-Provider: xai.
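A caller can confirm where a request landed by checking for that value. The sketch below assumes Agent-Resolved-Provider arrives as an HTTP response header and that the headers are available as a plain mapping. The helper names are hypothetical; only the header name and the value xai come from this page.

```python
from typing import Mapping, Optional

# Header name as written on this page; treating it as an HTTP response
# header is an assumption of this sketch.
RESOLVED_PROVIDER_HEADER = "Agent-Resolved-Provider"


def resolved_provider(headers: Mapping[str, str]) -> Optional[str]:
    """Return the provider named in the headers, or None if it is absent.

    HTTP header names are case-insensitive, so keys are compared in lowercase.
    """
    wanted = RESOLVED_PROVIDER_HEADER.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip()
    return None


def routed_to_xai(headers: Mapping[str, str]) -> bool:
    """True when the request resolved to xAI (value `xai` per this page)."""
    return resolved_provider(headers) == "xai"
```

A check like routed_to_xai(response_headers) is useful in tests that assert live-grounded requests really resolved here rather than falling through to another provider.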

Caching economics

xAI caches a stable context across consecutive requests and bills reads from that cache at a 75% discount. Monitor the cache hit rate through the usage metadata, and keep the stable prefix unchanged across calls so the cached context stays warm.

| Fact | Value |
| --- | --- |
| Cache type | Explicit (cached context) |
| Minimum cacheable prefix | 1,024 tokens |
| Cache-read discount | 75% |
| Default TTL | 300 seconds (5 min) |
| Extended TTL | 3,600 seconds (1 h) |
| Cached-token reporting | Yes |
| Manual cache clear | Not supported |
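The effect of the discount on a single request can be worked out directly. The following is a minimal sketch, assuming the 75% discount means cached tokens bill at 25% of the normal input rate. The function names and the per-million price argument are illustrative; this page does not state prices or the field names used in the usage metadata.

```python
MIN_CACHEABLE_PREFIX_TOKENS = 1_024  # from the table above
CACHE_READ_DISCOUNT = 0.75           # cached reads assumed to bill at 25% of the input rate


def is_cacheable(prefix_tokens: int) -> bool:
    """A prefix shorter than the minimum is never cached."""
    return prefix_tokens >= MIN_CACHEABLE_PREFIX_TOKENS


def cache_hit_rate(prompt_tokens: int, cached_tokens: int) -> float:
    """Fraction of prompt tokens served from cache (0.0 for an empty prompt)."""
    return cached_tokens / prompt_tokens if prompt_tokens else 0.0


def effective_input_cost(prompt_tokens: int, cached_tokens: int,
                         price_per_million: float) -> float:
    """Input cost when `cached_tokens` of the prompt are read from cache.

    `price_per_million` is a caller-supplied, hypothetical input price.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    billable = uncached + cached_tokens * (1 - CACHE_READ_DISCOUNT)
    return billable * price_per_million / 1_000_000
```

For example, a 10,000-token prompt with 8,000 cached tokens has a hit rate of 0.8 and bills as 4,000 full-price input tokens (2,000 uncached plus 8,000 at one quarter of the rate).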

No Batch tier

xAI has no Batch API (batch_api_supported is false). Route background and batch-class work to a provider that does have one - OpenAI, Anthropic, or Gemini - and keep xAI for the interactive and live-grounded traffic it is best at. xAI exposes a single standard service tier.
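One way to picture this rule is a filter over a preference list. The sketch below is illustrative only: the BATCH_SUPPORT mapping and the eligible_providers helper are hypothetical, and only xAI's false value and the three Batch-capable provider names come from this page.

```python
from typing import Dict, List

# Hypothetical snapshot of the batch_api_supported flag per provider.
BATCH_SUPPORT: Dict[str, bool] = {
    "xai": False,
    "openai": True,
    "anthropic": True,
    "gemini": True,
}


def eligible_providers(workload: str, preferred: List[str]) -> List[str]:
    """Keep the caller's preference order, dropping providers without a
    Batch API when the workload is batch-class."""
    if workload != "batch":
        return list(preferred)
    # Unknown providers are treated as not supporting Batch.
    return [p for p in preferred if BATCH_SUPPORT.get(p, False)]
```

With a preference list of ["xai", "openai"], a batch workload resolves to ["openai"], while an interactive workload keeps both entries in order.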

Live search grounding

Grok supports real-time web-search grounding, which lets a request pull in up-to-date information without a separate retrieval pipeline. The capability is advertised through the live_search_supported manifest flag, so the broker sends grounding-dependent work only to providers that report it, which currently means xAI.

When the broker routes here

Live-grounded reasoning

Tasks that need current web information inline, where a Perplexity-style grounding step would otherwise be required.

Cost-sensitive frontier

Grok-3 Mini for high-volume interactive work where a frontier model is needed but cost per token matters more than maximum capability.

Fast lightweight responses

Grok-3 Mini is competitive with commodity small frontier models on time-to-first-token.

Warm stable prefixes

Long stable contexts reused across consecutive calls take the 75% cached-context discount.

At a glance

| Capability | Value |
| --- | --- |
| Context window | 131,072 tokens |
| Multimodal input | Yes |
| Live search | Yes |
| Batch API | No |
| Dedicated deployment | No |
| Service tiers | standard |
| Data retention | standard |
| Regions | us |

Manifest revision: cap_2026_06_09. The capability manifest tells the broker that xAI is the only first-class option for live-grounded work and that background work must route elsewhere.
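A manifest-driven eligibility check might look like the sketch below. The Manifest class, its field layout, and the can_serve helper are assumptions made for illustration, not the broker's actual schema. The xAI values are taken from the table above, and the second entry is a made-up placeholder provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    provider: str
    context_window: int
    live_search_supported: bool
    batch_api_supported: bool


# xAI values from the "At a glance" table.
XAI = Manifest("xai", 131_072, live_search_supported=True,
               batch_api_supported=False)
# Placeholder provider with invented values, used only to exercise the check.
OTHER = Manifest("example-provider", 200_000, live_search_supported=False,
                 batch_api_supported=True)


def can_serve(m: Manifest, *, prompt_tokens: int,
              needs_live_search: bool, is_batch: bool) -> bool:
    """True when the manifest satisfies every requirement of the request."""
    if prompt_tokens > m.context_window:
        return False
    if needs_live_search and not m.live_search_supported:
        return False
    if is_batch and not m.batch_api_supported:
        return False
    return True
```

Under these assumptions, a live-grounded interactive request passes for XAI and fails for OTHER, while a batch request fails for XAI regardless of grounding needs.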
