Quickstart
Point an OpenAI client at Zumik, create a reusable artifact, and run a workload diagnostic in five minutes.
You need an API key and a prepaid credit balance. Create a key in the console, then add credits (from $5). Inference is blocked until your balance is funded, so do this first. See authentication for key formats and pricing for how billing works.
Point an OpenAI client at Zumik
The only line that changes is the base URL. Request and response bodies stay byte-for-byte OpenAI-compatible.
from openai import OpenAI
client = OpenAI(
base_url="https://api.zumik.ai/v1",
api_key="zk_live_...",
)
r = client.responses.create(
model="code.balanced", # a Zumik alias, resolved at request time
input="Review the latest patch.",
)
print(r.output_text)Tip
Keep stable content (system instructions, tools, context) at the front of the request so provider prompt caching can match the prefix. See prompt caching.
Create a reusable artifact on /v2
Turn stable instructions, tool definitions, or policies into an opaque, immutable handle instead of resending them on every call. The handle is the artifact ID; the content lives behind the tenant boundary.
curl https://api.zumik.ai/v2/artifacts \
-H "Authorization: Bearer zk_live_..." \
-H "Content-Type: application/json" \
-d '{"artifact_type":"policy","content":"Run the linter before every commit."}'
# => { "id": "art_01JY...", "object": "artifact", "artifact_type": "policy" }Group artifacts into an ordered bundle, then attach the bundle to a session to carry stable state across turns.
Run a workload diagnostic
Before you change any infrastructure, score how much of your traffic is reusable. The diagnostic reads metadata traces and returns a Workload Reuse Score, the reuse waterfall, a recommended execution profile, and the missed-opportunity gap.
curl https://api.zumik.ai/v2/diagnostics \
-H "Authorization: Bearer zk_live_..." \
-H "Content-Type: application/json" \
-d '{"source":"trace_export","trace_mode":"metadata","sample_ref":"trc_..."}'The diagnostic runs on metadata only by default, so no raw prompt text is retained. See workload diagnostics for how to read the report.
Where to go next
OpenAI compatibility
The exact compatibility contract and the header-based extension rules.
Sessions and branching
Append-only branches with optimistic concurrency for multi-turn agents.
Authentication
Key formats, bearer auth, rotation, and per-key budgets.
Pricing and credits
What the $5 plan includes and how pay-as-you-go credits work.