Zumik
Providers

Anthropic

Anthropic through Zumik - the deepest cache-read discount of any managed provider at 90%, explicit cache_control breakpoints, the Message Batches API at 50% off, zero-retention availability, and when the broker routes here.

Anthropic carries the single highest-leverage economics of the five first-class providers: a 90% cache-read discount, the deepest of any managed provider. For agent workloads where the system prompt, tool definitions, and knowledge base are stable across hundreds of requests per session, Anthropic frequently delivers the lowest effective input-token cost of any frontier provider - and once cache reads dominate, sometimes below open-source inference.

It is available on both the managed-provider and BYOK profiles. Requests that resolve here report Agent-Resolved-Provider: anthropic.

Caching economics

Anthropic caching is explicit: you place cache_control breakpoints at the end of each stable block (system prompt, tool definitions, long documents). A cache write costs more than a normal token, so mark only blocks you will actually reuse, and never mark volatile content.

FactValue
Cache typeExplicit (cache_control breakpoints)
Minimum cacheable block1,024 tokens
Cache-read discount90% on cache reads
Default TTL300 seconds (5 min)
Extended TTL3,600 seconds (1h) on Claude 3.5+
Cached-token reportingYes
Manual cache clearNot supported

Tip

Measure cache_creation_input_tokens against cache_read_input_tokens per session to verify your capture rate. The 90% discount only compounds if reads dominate writes - which is exactly what a stable prefix across many turns produces.

Batch

The Message Batches API delivers a 50% cost reduction at up to 24h turnaround. Use it for the same non-interactive classes as everywhere else: background evaluations, diagnostic reprocessing, pre-computable tool calls, replay experiments. Anthropic exposes a single standard service tier; there is no flex/scale split to match to QoS.

Retention

Anthropic's data_retention_class is zero_retention_available. That makes it a strong fit for BYOK when an account-level zero-retention policy has to apply to the inference calls, and it factors into the purge guarantee the platform can offer for Anthropic-cached state. See retention and purge.

When the broker routes here

Repeated long-context prompts

Route here when stable prefixes exceed ~1,024 tokens and recurrence is high - the 90% cache-read discount compounds faster than any other provider mechanism.

Stable agent context

A fixed system prompt, tool registry, and knowledge base reused across hundreds of requests is the canonical Anthropic win.

Zero-retention requirements

Workloads that need zero-retention available at the account level, typically under BYOK.

Reasoning-heavy tasks

Extended thinking for reasoning work; gate the added latency behind the right alias class rather than enabling it everywhere.

At a glance

CapabilityValue
Context window200,000 tokens
Multimodal inputYes
Live searchNo
Dedicated deploymentNo
Service tiersstandard
Data retentionzero_retention_available
Regionsus

Manifest revision cap_2026_06_09. The capability manifest records that Anthropic holds the deepest cache discount in the set; the broker uses these facts to decide where a repeated long-context prompt is cheapest.

On this page