API surfaces

Strict OpenAI compatibility on /v1, native state on /v2, and the one rule that keeps them apart: proprietary behavior travels through headers or /v2, never by changing upstream JSON.

Zumik exposes two public APIs over one internal execution system. They serve different goals and they are kept rigorously separate so that neither compromises the other.

/v1: strict compatibility

The purpose of /v1 is migration with minimal code changes. You point an OpenAI client at https://api.zumik.ai/v1, supply a Zumik API key, and leave the rest of your code unchanged.

Python
from openai import OpenAI

client = OpenAI(base_url="https://api.zumik.ai/v1", api_key="zk_live_...")
r = client.responses.create(model="code.balanced", input="Review the latest patch.")

The contract rule is absolute: mirror OpenAI request and response shapes exactly.

  • Do not add proprietary JSON fields to request or response objects.
  • Do not rename upstream fields or overload them with internal meaning.
  • Do not invent /v1 streaming events.

When a backend cannot implement an upstream feature correctly, the request is rejected with a compatible error or routed to a backend that can. Meaningful fields are never silently ignored.

/v1 deliberately keeps surfaces small where OpenAI keeps them small. For example, /v1/models returns the exact upstream object shape (id, object, created, owned_by). Rich alias metadata lives on /v2/model-aliases, not here.
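As an illustration of the "exact upstream shape" rule, a client-side conformance check might look like the sketch below. The helper name and the sample values (created, owned_by) are hypothetical and exist only for this example; the four key names come from the paragraph above.

```python
# Hypothetical conformance check; not part of any Zumik or OpenAI SDK.
# The four keys are the upstream model-object fields named in the text.
UPSTREAM_MODEL_KEYS = frozenset({"id", "object", "created", "owned_by"})


def model_shape_diff(model_obj: dict) -> dict:
    """Compare one /v1/models entry against the upstream key set."""
    keys = set(model_obj)
    return {
        "extra": keys - UPSTREAM_MODEL_KEYS,    # proprietary fields that should not be here
        "missing": UPSTREAM_MODEL_KEYS - keys,  # upstream fields that were dropped
    }


# Illustrative entry; the created and owned_by values are made up.
sample = {"id": "code.balanced", "object": "model", "created": 0, "owned_by": "zumik"}
diff = model_shape_diff(sample)
```

An empty "extra" set and an empty "missing" set mean the entry matches the upstream shape. Anything richer belongs on /v2/model-aliases.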

/v2: native state

The purpose of /v2 is everything the OpenAI shape has no room for: explicit state objects, branch control, replay, purge receipts, QoS outcomes, and rich telemetry. These are first-class fields, not headers.

curl https://api.zumik.ai/v2/artifacts \
  -H "Authorization: Bearer zk_live_..." \
  -H "Content-Type: application/json" \
  -d '{"artifact_type":"policy","content":"Run the linter before every commit."}'
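For readers working in Python, the same request can be sketched with the standard library alone. The function name is hypothetical, and the sketch only builds the request; it assumes nothing about the response beyond what the curl call above shows.

```python
import json
import urllib.request


def build_artifact_request(api_key: str, artifact_type: str, content: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v2/artifacts request shown in the curl example."""
    body = json.dumps({"artifact_type": artifact_type, "content": content}).encode("utf-8")
    return urllib.request.Request(
        "https://api.zumik.ai/v2/artifacts",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_artifact_request("zk_live_...", "policy", "Run the linter before every commit.")

# To send it (assuming the endpoint answers with a JSON body):
#   with urllib.request.urlopen(req) as resp:
#       artifact = json.load(resp)
```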

/v2 is REST-style and organized around the state object model: artifacts, bundles, sessions and their branches, snapshots, responses, plus diagnostics, replay runs, purge jobs, and model aliases.

The extension rule

This is the rule that lets both surfaces coexist without contaminating each other.

Proprietary behavior is transported through optional HTTP headers or /v2, never by changing upstream JSON shapes.

A client that omits every Zumik header must still receive valid OpenAI behavior on /v1. The optional request headers are:

Header                   Carries
Agent-Hints-Ref          A reference to a stored Agent Hints object (ah_...)
Agent-Session-Ref        A session reference (ses_...)
Agent-Trace-Mode         metadata, tokenized, or encrypted_full_fidelity
Agent-Idempotency-Key    An idempotency key for safe retries
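One way to attach these headers from Python is to build a plain dictionary and pass it per request. The helper below is a hypothetical sketch, not a Zumik SDK function; the header names and trace modes are the ones listed above. The OpenAI Python SDK accepts per-request headers through its extra_headers argument, which leaves the JSON body untouched.

```python
from typing import Dict, Optional

# Trace modes as listed for Agent-Trace-Mode.
TRACE_MODES = {"metadata", "tokenized", "encrypted_full_fidelity"}


def zumik_headers(
    hints_ref: Optional[str] = None,
    session_ref: Optional[str] = None,
    trace_mode: Optional[str] = None,
    idempotency_key: Optional[str] = None,
) -> Dict[str, str]:
    """Return only the optional headers the caller actually set."""
    if trace_mode is not None and trace_mode not in TRACE_MODES:
        raise ValueError(f"unknown trace mode: {trace_mode!r}")
    candidates = {
        "Agent-Hints-Ref": hints_ref,
        "Agent-Session-Ref": session_ref,
        "Agent-Trace-Mode": trace_mode,
        "Agent-Idempotency-Key": idempotency_key,
    }
    return {name: value for name, value in candidates.items() if value is not None}


headers = zumik_headers(session_ref="ses_...", trace_mode="metadata")

# Usage with the OpenAI SDK (body stays OpenAI-shaped):
#   client.responses.create(
#       model="code.balanced",
#       input="Review the latest patch.",
#       extra_headers=headers,
#   )
```

Calling zumik_headers() with no arguments returns an empty dictionary, which corresponds to the plain OpenAI behavior that /v1 must still honor.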

Proprietary signals flow back the same way. QoS results, for instance, ride on response headers on /v1 so the body stays OpenAI-shaped, while the full QoS outcome object is a JSON field on /v2.

Agent-QoS-Admission: admitted
Agent-QoS-Target-Met: true
Agent-QoS-Fallback-Used: false
Agent-Trace-Id: trc_...
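Reading these headers on the client side can be sketched as follows. The function and the returned key names are hypothetical; the sketch assumes the boolean headers carry the literal strings true and false, as in the example above. With the OpenAI Python SDK, response headers are reachable through the with_raw_response accessor.

```python
from typing import Mapping, Optional


def parse_qos_headers(headers: Mapping[str, str]) -> dict:
    """Extract the Agent-QoS-* and Agent-Trace-Id response headers, if present."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def flag(name: str) -> Optional[bool]:
        raw = lowered.get(name)
        return None if raw is None else raw.strip().lower() == "true"

    return {
        "admission": lowered.get("agent-qos-admission"),
        "target_met": flag("agent-qos-target-met"),
        "fallback_used": flag("agent-qos-fallback-used"),
        "trace_id": lowered.get("agent-trace-id"),
    }


qos = parse_qos_headers({
    "Agent-QoS-Admission": "admitted",
    "Agent-QoS-Target-Met": "true",
    "Agent-QoS-Fallback-Used": "false",
    "Agent-Trace-Id": "trc_...",
})

# Usage with the OpenAI SDK:
#   raw = client.responses.with_raw_response.create(model="code.balanced", input="...")
#   qos = parse_qos_headers(raw.headers)
#   response = raw.parse()  # the usual OpenAI-shaped object
```

Missing headers come back as None, so the same code works against a backend that sends none of them.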

On /v1, reuse capture is exposed through the standard usage.prompt_tokens_details.cached_tokens field plus headers. A vanilla OpenAI SDK can read it without knowing Zumik exists.
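A minimal sketch of reading that field from a decoded /v1 response body, using only the path named above. The helper name and the sample numbers are illustrative, and the helper treats an absent field as zero reuse, which is an assumption of this sketch.

```python
def cached_tokens(body: dict) -> int:
    """Return usage.prompt_tokens_details.cached_tokens, or 0 if any level is absent."""
    usage = body.get("usage") or {}
    details = usage.get("prompt_tokens_details") or {}
    return int(details.get("cached_tokens") or 0)


# Illustrative body fragment; the token counts are made up.
sample_body = {
    "usage": {
        "prompt_tokens": 1200,
        "prompt_tokens_details": {"cached_tokens": 1024},
    }
}
reused = cached_tokens(sample_body)
```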

On /v2, the full reuse report, QoS outcome, alias resolution record, and trace ID are explicit response fields. Nothing is hidden in headers because there is no compatibility constraint to honor.

Choosing a surface

You do not have to pick one. A common pattern is to keep request bodies on /v1 for SDK compatibility while attaching state and hints by reference through headers, then read rich telemetry from the /v2 usage and diagnostics endpoints.
