Plans
Pay-as-you-go prepaid credits, the control-plane fee that applies on every execution path, the managed-optimization pilot, and BYOC and enterprise contracts.
Zumik is pay-as-you-go: you pre-load credits, and they are drawn down as you use the API. There is no subscription and no free inference tier. You are billed for two things only: processed input tokens and generated output tokens, net of any reuse. There are no separate infrastructure meters.
Prepaid credits
Inference is funded by a prepaid credit balance. Buy credits through Stripe in any amount from $5 up; the charge is immediate and the credits land on your account balance in real time. Unused credits stay on your balance.
Inference requires a positive credit balance. With an empty balance, every inference call returns 402 credits_required, so clients receive a precise "add credits" signal rather than a generic error. Start a top-up from the console, where you enter your card directly on Zumik through the embedded Stripe Payment Element, or call POST /v2/billing/payment-intent (body { "amount_usd": 25 }), which returns a PaymentIntent client secret. Turn on auto-recharge to refill automatically from the saved card when your balance runs low.
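A minimal client-side sketch of the top-up call and the empty-balance signal described above. Only the path, the request body, the $5 minimum, and the 402 status come from this page. The host `https://api.zumik.example`, the bearer-token header, and the assumption that `credits_required` arrives in a JSON `error` field are hypothetical placeholders.

```python
import json
import urllib.request

BASE_URL = "https://api.zumik.example"  # hypothetical host, not from the docs


def build_topup_request(api_key: str, amount_usd: float) -> urllib.request.Request:
    """Build (but do not send) a POST /v2/billing/payment-intent request."""
    if amount_usd < 5:
        raise ValueError("top-ups start at $5")
    body = json.dumps({"amount_usd": amount_usd}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/billing/payment-intent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def needs_credits(status: int, payload: dict) -> bool:
    """True when a response looks like the 402 credits_required signal.

    The `error` field name is an assumption about the response shape.
    """
    return status == 402 and payload.get("error") == "credits_required"
```

Sending the request with `urllib.request.urlopen` would return the PaymentIntent client secret, which is then confirmed on the client with Stripe. Checking `needs_credits` before any generic error handling lets an application prompt for a top-up instead of retrying.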
By default inference pauses when you run out. You can opt into unlimited overage billed at standard pay-as-you-go rates; see credits and budgets for how caps, alerts, and overage compose.
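The default-pause and overage rules can be modeled as a small decision function. This is an illustrative sketch of the behavior described above, not Zumik's implementation: the function name and return labels are hypothetical, and it ignores the caps and alerts covered under credits and budgets.

```python
def admit_inference(balance_usd: float, overage_enabled: bool = False) -> str:
    """Return how an inference call would be treated under the stated rules."""
    if balance_usd > 0:
        return "allowed"                # drawn down from prepaid credits
    if overage_enabled:
        return "allowed_as_overage"     # billed at standard pay-as-you-go rates
    return "402 credits_required"       # default: inference pauses
```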
What you pay
On the managed-provider path, Zumik calls the provider for you and bills a single per-model price: the provider's list price minus a small published discount. Your bill is therefore always lower than the list-price cost of calling the provider directly. Zumik's margin comes from the caching, batch, and flex savings it captures.
On bring-your-own-key (BYOK) and bring-your-own-cloud (BYOC) paths, you pay the provider (or run the model) yourself, and Zumik bills only a control-plane fee, a small percentage of the request's provider list value, for the routing, state, reproducibility, and optimization the platform provides. The fee applies on every path.
| Profile | What you pay Zumik | Provider spend |
|---|---|---|
| Managed provider | Per-model published price (list minus the headline discount), net of reuse credit | Included in the metered rate |
| BYOK | A control-plane fee per request | Billed by the provider directly to your account |
| BYOC | A control-plane fee per request | Your own infrastructure costs |
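The billing models in the table reduce to simple arithmetic. The sketch below uses placeholder rates (a 5% managed discount and a 3% control-plane fee) chosen only for illustration; this page does not state the actual percentages. The function names are hypothetical, reuse credit is left out, and the BYOK total assumes you pay the provider full list price.

```python
def managed_charge(list_value_usd: float, discount_rate: float) -> float:
    """Zumik bill on the managed path: list price minus the published discount."""
    return list_value_usd * (1.0 - discount_rate)


def control_plane_fee(list_value_usd: float, fee_rate: float) -> float:
    """Zumik bill on BYOK/BYOC: a percentage of the request's provider list value."""
    return list_value_usd * fee_rate


# Placeholder inputs, not published Zumik rates.
LIST_VALUE = 100.00  # provider list value of the traffic, in USD
DISCOUNT = 0.05      # assumed managed discount
FEE = 0.03           # assumed control-plane fee rate

# Managed: everything is paid to Zumik.
managed_total = managed_charge(LIST_VALUE, DISCOUNT)

# BYOK: provider spend (assumed at full list price) plus the Zumik fee.
byok_total = LIST_VALUE + control_plane_fee(LIST_VALUE, FEE)
```

Under these placeholder numbers, BYOK comes out cheaper only if your own provider rate is far enough below list to offset the fee. The same comparison applies to BYOC, with your infrastructure cost in place of provider spend.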
Managed-optimization pilot
For teams that want hands-on tuning, the pilot is $5,000 to $20,000 per month plus usage. It includes dedicated sales engineering and SLA-backed optimization reviews: prompt-layout work, alias-policy evaluation, and replay-driven profile recommendations. Run a diagnostic first; the pilot only makes sense when the workload has measurable reuse to capture.
BYOC and enterprise
Bring-your-own-cloud (BYOC) and enterprise plans are negotiated annual contracts billed through Stripe Invoicing and scoped by execution profile, support tier, and security requirements. Activate BYOC only where replay data shows that the blended total cost is lower than the managed-provider bill at the same reliability.
The free workload scan does not require a payment method or credits. It is metered separately and has its own hard cap, so you can measure reuse opportunity before loading credits.
Plan comparison
| Plan | Pricing | What it covers |
|---|---|---|
| Pay-as-you-go | Prepaid credits from $5 up | Per-model pricing under provider list on managed; a control-plane fee on BYOK/BYOC |
| Pilot | $5k to $20k/month plus usage | Dedicated engineering and SLA-backed reviews |
| Enterprise / BYOC | Negotiated annual contract | Private execution, custom retention, and security scope |
Next: credits and budgets
Hard caps, the 50/80/100% alert ladder, per-key budgets, and overage opt-in.