Zumik
Pricing

Plans

Pay-as-you-go prepaid credits, the control-plane fee that applies on every execution path, the managed-optimization pilot, and BYOC and enterprise contracts.

Zumik is pay-as-you-go: you pre-load credits and they are drawn down as you use the API. There is no subscription and no free inference tier. You are billed for two things: processed input tokens and generated output tokens, net of any reuse. There are no separate infrastructure meters to track.

Prepaid credits

Inference is funded by a prepaid credit balance. Buy credits through Stripe in any amount from $5 up; the charge is immediate and the credits land on your account balance in real time. Unused credits stay on your balance.

Inference requires a positive credit balance. With an empty balance, every inference call returns 402 credits_required, so your client gets a precise "add credits" signal rather than a generic error. Start a top-up from the console, where you enter your card directly on Zumik through the embedded Stripe Payment Element, or call POST /v2/billing/payment-intent (body { "amount_usd": 25 }), which returns a PaymentIntent client secret. Turn on auto-recharge to refill automatically from the saved card when your balance runs low.
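As a rough illustration of the top-up call and the empty-balance signal, here is a minimal Python sketch. Only the path, the request body shape, the $5 minimum, and the 402 status come from this page. The host name, the bearer-token header, and the helper names are assumptions made for the example, and the request is built but not sent.

```python
import json
import urllib.request

# Assumption: placeholder host; this page does not state the API base URL.
BASE_URL = "https://api.zumik.example"
MIN_TOPUP_USD = 5  # credits can be bought in any amount from $5 up


def build_topup_request(amount_usd, api_key):
    """Build (but do not send) a POST /v2/billing/payment-intent request."""
    if amount_usd < MIN_TOPUP_USD:
        raise ValueError(f"top-ups start at ${MIN_TOPUP_USD}")
    body = json.dumps({"amount_usd": amount_usd}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/billing/payment-intent",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the auth scheme is not described here.
            "Authorization": f"Bearer {api_key}",
        },
    )


def needs_credits(status_code):
    """True when an inference call was rejected with 402 credits_required."""
    return status_code == 402
```

A client could check `needs_credits(response_status)` after each inference call and, when it is true, prompt for a top-up instead of retrying. Sending the request with `urllib.request.urlopen` would return the PaymentIntent client secret; the name of the response field that carries it is not specified on this page.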

By default inference pauses when you run out. You can opt into unlimited overage billed at standard pay-as-you-go rates; see credits and budgets for how caps, alerts, and overage compose.

What you pay

On the managed-provider path, Zumik calls the provider for you and bills a clean per-model price: the provider's list price minus a small published discount. Your bill is therefore always lower than it would be if you called the provider directly. The caching, batch, and flex savings Zumik captures are its margin.

On bring-your-own-key and bring-your-own-cloud paths you pay the provider (or run the model) yourself, and Zumik bills only a control-plane fee — a small percentage of the request's provider list value — for the routing, state, reproducibility, and optimization the platform provides. The fee applies on every path.

| Profile | What you pay Zumik | Provider spend |
| --- | --- | --- |
| Managed provider | Per-model published price (list minus the headline discount), net of reuse credit | Included in the metered rate |
| BYOK | A control-plane fee per request | Billed by the provider directly to your account |
| BYOC | A control-plane fee per request | Your own infrastructure costs |
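The arithmetic behind the two billing models can be sketched as follows. This is an illustration only: the discount rate and fee rate are parameters because this page does not publish them, per-million-token list prices are an assumed quoting convention, and treating the reuse credit as a dollar amount subtracted from the charge is an assumption.

```python
def list_value_usd(input_tokens, output_tokens, in_price_per_mtok, out_price_per_mtok):
    """Provider list value of one request (assumes prices per million tokens)."""
    return (input_tokens * in_price_per_mtok
            + output_tokens * out_price_per_mtok) / 1_000_000


def managed_charge_usd(list_value, discount_rate, reuse_credit_usd=0.0):
    """Managed path: list price minus the published discount, net of reuse credit.

    Assumption: the reuse credit is a dollar amount and the charge never goes below zero.
    """
    return max(list_value * (1 - discount_rate) - reuse_credit_usd, 0.0)


def control_plane_fee_usd(list_value, fee_rate):
    """BYOK/BYOC path: a percentage of the request's provider list value."""
    return list_value * fee_rate
```

With hypothetical numbers, a request of 1,000,000 input tokens at $2 per million and 500,000 output tokens at $8 per million has a list value of $6. At an illustrative 5% discount the managed charge is about $5.70; at an illustrative 5% fee rate the control-plane fee is about $0.30, with the provider or infrastructure cost paid separately.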

Managed-optimization pilot

For teams that want hands-on tuning, the pilot is $5,000 to $20,000 per month plus usage. It includes dedicated sales engineering and SLA-backed optimization reviews: prompt-layout work, alias-policy evaluation, and replay-driven profile recommendations. Run a diagnostic first; the pilot only makes sense when the workload has measurable reuse to capture.

BYOC and enterprise

Bring your own cloud and enterprise are negotiated annual contracts billed through Stripe Invoicing, scoped by execution profile, support tier, and security requirements. Activate BYOC only where replay data shows the blended total cost is genuinely lower than the managed-provider bill at the same reliability.
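The activation rule above amounts to a cost comparison, sketched below. It reads "blended total cost" as your own infrastructure cost plus the Zumik control-plane fee, which is an interpretation of the profile table and not a published formula. The function names are hypothetical, and the "same reliability" condition is left as a caller-supplied flag because it cannot be derived from cost figures.

```python
def byoc_blended_cost_usd(infra_cost_usd, control_plane_fee_usd):
    """Assumed blended BYOC cost: own infrastructure plus the control-plane fee."""
    return infra_cost_usd + control_plane_fee_usd


def should_activate_byoc(managed_bill_usd, infra_cost_usd, control_plane_fee_usd,
                         reliability_matches=True):
    """True only when BYOC is strictly cheaper at the same reliability."""
    if not reliability_matches:
        return False
    blended = byoc_blended_cost_usd(infra_cost_usd, control_plane_fee_usd)
    return blended < managed_bill_usd
```

Feeding this with figures from replay data for the same period, for example `should_activate_byoc(10_000, 7_500, 500)`, keeps the decision tied to measured cost instead of an estimate.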

The free workload scan does not require a payment method or credits. It is metered separately and has its own hard cap, so you can measure reuse opportunity before loading credits.

Plan comparison

Pay-as-you-go

Prepaid credits from $5 up. Per-model pricing under provider list on managed; a control-plane fee on BYOK/BYOC.

Pilot

$5k to $20k/month plus usage. Dedicated engineering and SLA-backed reviews.

Enterprise / BYOC

Negotiated annual contract. Private execution, custom retention, and security scope.

Next: credits and budgets

Hard caps, the 50/80/100% alert ladder, per-key budgets, and overage opt-in.
