OpenAI compatibility
The strict /v1 compatibility contract, exact request and response shapes, no proprietary JSON fields, and proprietary signal carried on Agent-* headers.
The purpose of /v1 is migration with minimal code changes. Point your existing OpenAI client at https://api.zumik.ai/v1 and it works unchanged.
The contract
/v1 mirrors OpenAI request and response shapes exactly. Zumik does not add proprietary JSON fields, does not rename fields, does not invent new response events, and does not overload upstream fields with internal meaning.
A client that sends a plain OpenAI request and reads a plain OpenAI response sees nothing Zumik-specific in the body. That is the point: a vanilla OpenAI SDK stays correct.
When a backend cannot implement a meaningful upstream field, the request is rejected with a compatible error or routed to a backend that can serve it. Zumik does not silently ignore fields that change the result.
Covered endpoints
These /v1 endpoints are conformance-tested and live today.
| Method | Path | Notes |
|---|---|---|
POST | /v1/responses | Primary generation endpoint |
GET | /v1/responses/{response_id} | Retrieve a response |
POST | /v1/responses/{response_id}/cancel | Cancel an in-flight response |
POST | /v1/responses/input_tokens | Count input tokens |
POST | /v1/chat/completions | Chat Completions compatibility |
POST | /v1/embeddings | Embeddings |
GET | /v1/models | List available models |
GET | /v1/models/{model_id} | Retrieve one model |
Conversations, batches, and files compatibility expand the surface as the conformance matrix grows. See the API reference for the full per-endpoint schemas.
/v1/models returns the small upstream-compatible model object (id, object, created, owned_by). Richer capability metadata stays out of /v1 and lives under /v2 model aliases, so the list response shape matches OpenAI's.
Models can be aliases
The model field accepts a concrete model ID or a Zumik alias such as code.fast or auto.balanced. An alias resolves at request time to an immutable release, so routing is reproducible. The body shape is identical either way.
Proprietary signal rides on headers
Anything Zumik-specific travels through optional HTTP headers, never the JSON body. A request that omits every Agent-* header still gets valid /v1 behavior.
Request headers
| Header | Purpose |
|---|---|
Agent-Hints-Ref: ah_... | Reference a stored agent-hints contract |
Agent-Session-Ref: ses_... | Attach the request to a session |
Agent-Trace-Mode: metadata | tokenized | encrypted_full_fidelity | Choose the trace privacy mode |
Agent-Idempotency-Key: ... | Make a mutating request safe to retry |
Response headers
Proprietary outcomes come back on headers so the body stays OpenAI-shaped.
Agent-Execution-Profile: managed_provider
Agent-QoS-Admission: admitted
Agent-QoS-Target-Met: true
Agent-QoS-Fallback-Used: false
Agent-Trace-Id: trc_...Where reuse savings apply, the discounted input fraction is also reflected in the standard usage.prompt_tokens_details.cached_tokens field, so reuse stays measurable with a vanilla OpenAI SDK and nothing proprietary in the body.
Do not expect Zumik objects such as qos_outcome inside an OpenAI-compatible response. The full outcome object lives on /v2 and the telemetry API. Read the compact subset from the response headers above.
When to reach for /v2
Use /v2 when you want explicit state, branching, replay, signed purge receipts, or the full QoS outcome object as first-class JSON. The two surfaces interoperate, so you can migrate the body to /v1 first and adopt /v2 state where it pays off.