Credits and budgets
Credits, project hard caps, 50/80/100% alert thresholds, per-API-key budgets, and opt-in overage.
Spend can never surprise you. A project has a credit balance, an optional hard cap, an alert ladder, and an explicit overage choice. Money is tracked internally in micros (1,000,000 micros = $1) so per-token charges never lose precision to float rounding.
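To make the precision point concrete, here is a minimal sketch of converting between dollars and micros. The helper names `usd_to_micros` and `micros_to_usd` are hypothetical and not part of the API; only the ratio of 1,000,000 micros to $1 comes from the text above.

```python
from decimal import Decimal

MICROS_PER_USD = 1_000_000  # from the docs: 1,000,000 micros = $1


def usd_to_micros(amount_usd: str) -> int:
    """Convert a decimal dollar string to integer micros (hypothetical helper)."""
    micros = Decimal(amount_usd) * MICROS_PER_USD
    if micros != micros.to_integral_value():
        raise ValueError("amount has more than 6 decimal places")
    return int(micros)


def micros_to_usd(micros: int) -> Decimal:
    """Convert integer micros back to an exact decimal dollar amount."""
    return Decimal(micros) / MICROS_PER_USD


# Adding a small charge ten times: float dollars drift, integer micros do not.
float_total = sum(0.1 for _ in range(10))
micros_total = sum(usd_to_micros("0.1") for _ in range(10))
```

Here `float_total` ends up slightly below 1.0, while `micros_total` is exactly 1,000,000.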
Credits
Your credit balance holds your prepaid pay-as-you-go top-ups. Each top-up is added to the balance as soon as Stripe confirms payment, and spend is drawn down per request. Read the live state with GET /v2/billing/account:
{
  "object": "billing_account",
  "plan": "base",
  "subscription_status": "none",
  "credit_balance_micros": 4200000,
  "cycle_spend_micros": 12500000,
  "monthly_budget_micros": 50000000,
  "overage_mode": "pause"
}
subscription_status is none, active, past_due, or canceled. It no longer gates inference; a positive credit_balance_micros does. The field still reflects enterprise-contract state. See plans for how billing works.
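As a rough sketch of reading that response, the snippet below parses the sample body and reports the balance without going through floats. `format_usd` and `summarize_account` are hypothetical helpers. Treating a missing or null `monthly_budget_micros` as "no cap" is an assumption, since the text only says that null removes the cap.

```python
import json

# Sample body copied from the example response above.
SAMPLE_BODY = """
{
  "object": "billing_account",
  "plan": "base",
  "subscription_status": "none",
  "credit_balance_micros": 4200000,
  "cycle_spend_micros": 12500000,
  "monthly_budget_micros": 50000000,
  "overage_mode": "pause"
}
"""


def format_usd(micros: int) -> str:
    """Render integer micros as a dollar string using integer math only."""
    sign = "-" if micros < 0 else ""
    dollars, remainder = divmod(abs(micros), 1_000_000)
    return f"{sign}${dollars}.{remainder:06d}"


def summarize_account(account: dict) -> dict:
    """Summarize a billing_account object (hypothetical helper)."""
    budget = account.get("monthly_budget_micros")  # assumed null when no cap is set
    return {
        "balance": format_usd(account["credit_balance_micros"]),
        "cycle_spend": format_usd(account["cycle_spend_micros"]),
        "budget": None if budget is None else format_usd(budget),
        # Per the text, a positive credit balance is what gates inference.
        "has_credit": account["credit_balance_micros"] > 0,
    }


summary = summarize_account(json.loads(SAMPLE_BODY))
```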
Project hard caps
A project can set a hard monthly cap. In the default pause mode, once cycle spend reaches the cap, new inference returns 429 quota_exceeded. The error matches OpenAI's quota shape, so existing retry and backoff logic treats it identically.
curl -X POST https://api.zumik.ai/v2/billing/budget \
  -H "Authorization: Bearer zk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"monthly_budget_usd": 50.0}'
Pass null to remove the cap. A negative value returns 400 with param monthly_budget_usd. Spendable funds in pause mode are the smaller of your credit balance and the remaining headroom under the cap.
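The pause-mode rule can be worked through in a few lines. This is an illustrative sketch, not the service's implementation. The function name is hypothetical, and treating "no cap" as "spendable equals the credit balance" is an assumption.

```python
from typing import Optional


def spendable_micros_pause(
    credit_balance_micros: int,
    cycle_spend_micros: int,
    monthly_budget_micros: Optional[int],
) -> int:
    """Smaller of the credit balance and the headroom left under the cap."""
    balance = max(credit_balance_micros, 0)
    if monthly_budget_micros is None:
        # Assumption: with no cap set, only the credit balance limits spend.
        return balance
    headroom = max(monthly_budget_micros - cycle_spend_micros, 0)
    return min(balance, headroom)


# Values from the sample billing_account response:
# headroom = 50,000,000 - 12,500,000 = 37,500,000, so the balance is the limit.
sample = spendable_micros_pause(4_200_000, 12_500_000, 50_000_000)
```

With the sample account, the $4.20 balance is smaller than the $37.50 of headroom, so $4.20 is spendable.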
Alert thresholds
Setting a cap arms a soft alert ladder at 50%, 80%, and 100% of the budget. Each threshold fires exactly once per cycle and is re-armed when a new cap is set or the cycle rolls over on a paid invoice.
| Threshold | Meaning |
|---|---|
| 50% | Half the monthly budget consumed |
| 80% | Approaching the cap |
| 100% | Cap reached (inference pauses unless overage is on) |
Alerts are driven by Stripe billing events, not polling, and are delivered by email and webhook to a customer-owned endpoint.
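The "fires exactly once per cycle" behavior of the ladder can be sketched as below. This only illustrates the described semantics. How the service evaluates thresholds internally is not documented here, and `newly_crossed` and the `fired` set are hypothetical names.

```python
THRESHOLDS_PCT = (50, 80, 100)  # ladder from the table above


def newly_crossed(cycle_spend_micros: int, budget_micros: int, fired: set) -> list:
    """Return thresholds reached by current spend that have not fired yet."""
    crossed = []
    for pct in THRESHOLDS_PCT:
        # Integer comparison avoids float percentages:
        # spend / budget >= pct / 100  <=>  spend * 100 >= budget * pct
        if pct not in fired and cycle_spend_micros * 100 >= budget_micros * pct:
            crossed.append(pct)
    return crossed


budget = 50_000_000
fired = set()  # re-armed (cleared) on a new cap or cycle rollover
history = []
for spend in (12_500_000, 26_000_000, 41_000_000, 41_500_000, 50_000_000):
    hits = newly_crossed(spend, budget, fired)
    fired.update(hits)
    history.append(hits)
```

A single large jump in spend can cross several thresholds at once. In this sketch they are all returned together.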
Per-API-key budgets
A single key can carry its own limit, independent of the project cap, so one team member's key cannot drain a shared budget. Set it with POST /v2/api-keys/{key_id}/budget:
curl -X POST https://api.zumik.ai/v2/api-keys/key_01jy.../budget \
  -H "Authorization: Bearer zk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"limit_usd": 25.0}'
A key that exhausts its own limit returns 429 quota_exceeded even when the project cap still has room. Pass null to clear it. Per-key budgets are also covered in authentication.
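One way to picture how the two limits interact in pause mode is the check below. It is a sketch under assumptions: `blocking_limit` is a hypothetical helper, and `key_spend_micros` is a hypothetical running total for the key (the text does not say over what window per-key spend is measured).

```python
from typing import Optional


def blocking_limit(
    key_spend_micros: int,
    key_limit_micros: Optional[int],
    cycle_spend_micros: int,
    project_cap_micros: Optional[int],
) -> Optional[str]:
    """Return which limit would block a request, or None if neither does."""
    # The key's own limit applies even when the project cap still has room.
    if key_limit_micros is not None and key_spend_micros >= key_limit_micros:
        return "api_key"
    if project_cap_micros is not None and cycle_spend_micros >= project_cap_micros:
        return "project"
    return None


# Key has used its full $25 while the project is at $30 of a $50 cap.
result = blocking_limit(25_000_000, 25_000_000, 30_000_000, 50_000_000)
```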
Overage opt-in
By default inference pauses at the hard cap. To keep serving past it, billed at standard pay-as-you-go rates, opt in explicitly. confirm: true is required; enabling overage without it returns 400.
curl -X POST https://api.zumik.ai/v2/billing/overage \
  -H "Authorization: Bearer zk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"allow_overage": true, "confirm": true}'
Overage opt-in lets spend continue past the budget; it never invents credits. The opt-in is recorded in the project audit log so the decision is provable.
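A client can mirror the confirm requirement locally before sending the request. The builder below is a hypothetical helper that only constructs the JSON body. It assumes that disabling overage uses the same endpoint with `allow_overage` set to false and needs no confirmation, which the text implies but does not state.

```python
import json


def build_overage_body(allow_overage: bool, confirm: bool = False) -> str:
    """Build the JSON body for POST /v2/billing/overage (hypothetical helper)."""
    if allow_overage and not confirm:
        # Mirrors the documented 400: enabling overage requires confirm: true.
        raise ValueError("confirm=True is required to enable overage")
    body = {"allow_overage": allow_overage}
    if confirm:
        body["confirm"] = True
    return json.dumps(body)


enable_body = build_overage_body(True, confirm=True)
```

Failing fast on the client keeps an accidental opt-in from depending on a server round trip to be caught.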
See savings as a credit
Realized reuse becomes a visible reuse credit on the input side of your bill.
Plans
Pay-as-you-go prepaid credits, the control-plane fee that applies on every execution path, the managed-optimization pilot, and BYOC and enterprise contracts.
Reuse credit
How realized reuse becomes a visible credit on your bill: gross input charge minus reuse credit equals processed input charge.