Zumik

Quickstart

Point an OpenAI client at Zumik, create a reusable artifact, and run a workload diagnostic in five minutes.

You need an API key and a prepaid credit balance. Create a key in the console, then add credits (from $5). Inference is blocked until your balance is funded, so do this first. See authentication for key formats and pricing for how billing works.

Point an OpenAI client at Zumik

The only line that changes is the base URL. Request and response bodies stay byte-for-byte OpenAI-compatible.

Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.zumik.ai/v1",
    api_key="zk_live_...",
)

r = client.responses.create(
    model="code.balanced",   # a Zumik alias, resolved at request time
    input="Review the latest patch.",
)
print(r.output_text)

Tip

Keep stable content (system instructions, tools, context) at the front of the request so provider prompt caching can match the prefix. See prompt caching.

Create a reusable artifact on /v2

Turn stable instructions, tool definitions, or policies into an opaque, immutable handle instead of resending them on every call. The handle is the artifact ID; the content lives behind the tenant boundary.

curl https://api.zumik.ai/v2/artifacts \
  -H "Authorization: Bearer zk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"artifact_type":"policy","content":"Run the linter before every commit."}'
# => { "id": "art_01JY...", "object": "artifact", "artifact_type": "policy" }

Group artifacts into an ordered bundle, then attach the bundle to a session to carry stable state across turns.

Run a workload diagnostic

Before you change any infrastructure, score how much of your traffic is reusable. The diagnostic reads metadata traces and returns a Workload Reuse Score, the reuse waterfall, a recommended execution profile, and the missed-opportunity gap.

curl https://api.zumik.ai/v2/diagnostics \
  -H "Authorization: Bearer zk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"source":"trace_export","trace_mode":"metadata","sample_ref":"trc_..."}'

The diagnostic runs on metadata only by default, so no raw prompt text is retained. See workload diagnostics for how to read the report.

Where to go next

On this page