
Responses

Create, retrieve, and cancel native responses.

The native /v2/responses surface carries session state and an explicit QoS request directly in the request body, rather than passing them through compatibility headers as /v1 does. The response reports the chosen execution profile and the formal QoS outcome inline. Response ids are prefixed rsp_. See QoS.

All requests require a bearer API key. Creating a response runs inference, so the project must have a positive prepaid credit balance and be under its budget. See the authentication page and the billing and budgets page.

Create a response

POST /v2/responses

model (string, required)

The model alias or concrete target to run, e.g. code.fast. Resolved through an immutable alias release.

input (string, required)

The input text. Must not be empty.

session_id (string)

Optional ses_... to associate the response with a session. Must exist in this project.

branch_id (string)

Optional br_... branch within the session. Must resolve through a session this project owns.

qos (object)

Optional explicit QoS request. Defaults to the standard class with no hard targets and compatible fallback allowed.

qos

class (string, required)

One of interactive, standard, background, batch.

target_ttft_ms (integer)

Time-to-first-token target. The outcome reports target_met against it.

deadline_ms (integer)

Hard deadline. If the provider does not respond within this window the request aborts with deadline_exceeded (504) and is never charged.

priority (integer)

Scheduling priority hint, 0-255.

degrade_policy (string, required)

One of forbid, allow_compatible_fallback.
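The qos fields above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any Zumik SDK: the helper name build_qos and the DEFAULT_QOS constant are hypothetical, and DEFAULT_QOS is only a reading of the documented default (standard class, no hard targets, compatible fallback allowed). The server remains the source of truth for validation.

```python
# Allowed values copied from the parameter descriptions above.
QOS_CLASSES = {"interactive", "standard", "background", "batch"}
DEGRADE_POLICIES = {"forbid", "allow_compatible_fallback"}

# Assumption: this mirrors the documented default when qos is omitted.
DEFAULT_QOS = {"class": "standard", "degrade_policy": "allow_compatible_fallback"}


def build_qos(qos_class, degrade_policy,
              target_ttft_ms=None, deadline_ms=None, priority=None):
    """Build a qos dict, rejecting values the reference does not list."""
    if qos_class not in QOS_CLASSES:
        raise ValueError(f"unknown qos class: {qos_class!r}")
    if degrade_policy not in DEGRADE_POLICIES:
        raise ValueError(f"unknown degrade_policy: {degrade_policy!r}")
    if priority is not None and not 0 <= priority <= 255:
        raise ValueError("priority must be in the range 0-255")

    qos = {"class": qos_class, "degrade_policy": degrade_policy}
    # Optional fields are only included when the caller set them.
    optional = {
        "target_ttft_ms": target_ttft_ms,
        "deadline_ms": deadline_ms,
        "priority": priority,
    }
    qos.update({k: v for k, v in optional.items() if v is not None})
    return qos
```

The returned dict can be passed as the qos value of the request body shown in the examples that follow.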

curl https://api.zumik.ai/v2/responses \
  -H "Authorization: Bearer $ZUMIK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "code.fast",
    "input": "Summarize the diff in three bullets.",
    "session_id": "ses_01jy7n7w2z9pcu7a5q2gh8askm",
    "branch_id": "br_01jy7n7w30ardv8b6r3jk9btln",
    "qos": { "class": "interactive", "target_ttft_ms": 500, "deadline_ms": 8000, "degrade_policy": "allow_compatible_fallback" }
  }'
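
from zumik import Zumik

# Credentials are assumed to come from the environment (e.g. ZUMIK_API_KEY).
zk = Zumik()
resp = zk.responses.create(
    model="code.fast",
    input="Summarize the diff in three bullets.",
    # Same session and branch ids as the curl example above.
    session_id="ses_01jy7n7w2z9pcu7a5q2gh8askm",
    branch_id="br_01jy7n7w30ardv8b6r3jk9btln",
    # class and degrade_policy are both required inside qos.
    qos={
        "class": "interactive",
        "target_ttft_ms": 500,
        "deadline_ms": 8000,
        "degrade_policy": "allow_compatible_fallback",
    },
)
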
{
  "id": "rsp_01jy7ndf89g0h1j2k3l4m5n6op",
  "object": "response",
  "session_id": "ses_01jy7n7w2z9pcu7a5q2gh8askm",
  "branch_id": "br_01jy7n7w30ardv8b6r3jk9btln",
  "status": "completed",
  "model": "code.fast",
  "execution_profile": "managed_provider",
  "output_text": "- Adds a CAS guard to branch append\n- Cleans up the dead retry shim\n- Bumps the test fixtures",
  "qos_outcome": {
    "admission": "admitted",
    "completion": "completed",
    "target_met": true,
    "ttft_ms": 312,
    "latency_ms": 1840,
    "deadline_met": true,
    "degraded": false,
    "fallback_used": false,
    "reason_code": null
  }
}

id (string)

Opaque response id, prefixed rsp_.

object (string)

Always response.

session_id (string)

The associated session, or null.

branch_id (string)

The associated branch, or null.

status (string)

completed on success; cancelled after a cancel.

model (string)

The requested model, echoed back.

execution_profile (string)

Which profile served the request: managed_provider, byok, or subscription.

output_text (string)

The generated text.

qos_outcome (object)

The formal QoS outcome.

qos_outcome

admission (string)

One of admitted, queued, rejected, expired_before_start.

completion (string)

One of completed, failed, cancelled, expired_during_execution.

target_met (boolean)

Whether target_ttft_ms was met, or null when no target was set.

ttft_ms (integer)

Observed time to first token.

latency_ms (integer)

Observed total latency.

deadline_met (boolean)

Whether deadline_ms was met, or null when none was set.

degraded (boolean)

Whether the request was served in a degraded mode.

fallback_used (boolean)

Whether a fallback profile served the request.

reason_code (string)

A stable reason code when a target was missed, otherwise null. One of queue_saturation, provider_rate_limit, provider_timeout, region_unavailable, alias_no_compatible_target, cache_miss, cache_transfer_slower_than_recompute, fallback_profile_used, customer_deadline_too_short.
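As a rough illustration of reading the qos_outcome fields above, the sketch below lists which requested guarantees were not delivered. The helper name qos_misses is hypothetical and not part of any Zumik SDK; it relies only on the field names and null semantics described above, in particular that target_met and deadline_met are null when no target or deadline was set.

```python
def qos_misses(outcome):
    """Return the names of QoS aspects that were not delivered as requested."""
    misses = []
    # None means "nothing was requested", so only an explicit False is a miss.
    if outcome.get("target_met") is False:
        misses.append("target_ttft_ms")
    if outcome.get("deadline_met") is False:
        misses.append("deadline_ms")
    # degraded and fallback_used are plain booleans.
    if outcome.get("degraded"):
        misses.append("degraded")
    if outcome.get("fallback_used"):
        misses.append("fallback_used")
    return misses


# The qos_outcome from the example response above.
example_outcome = {
    "admission": "admitted",
    "completion": "completed",
    "target_met": True,
    "ttft_ms": 312,
    "latency_ms": 1840,
    "deadline_met": True,
    "degraded": False,
    "fallback_used": False,
    "reason_code": None,
}
```

For the example response, qos_misses(example_outcome) returns an empty list. When the list is non-empty, reason_code is the field to log alongside it.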

Retrieve a response

GET /v2/responses/{response_id}

response_id (string, path, required)

The rsp_... id to fetch.

curl https://api.zumik.ai/v2/responses/rsp_01jy7ndf89g0h1j2k3l4m5n6op \
  -H "Authorization: Bearer $ZUMIK_API_KEY"

Returns the stored response object.

Cancel a response

POST /v2/responses/{response_id}/cancel

Marks a stored response cancelled. Returns the updated object.

response_id (string, path, required)

The rsp_... id to cancel.

curl -X POST https://api.zumik.ai/v2/responses/rsp_01jy7ndf89g0h1j2k3l4m5n6op/cancel \
  -H "Authorization: Bearer $ZUMIK_API_KEY"
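The retrieve and cancel calls differ only in HTTP method and a /cancel path suffix, which the sketch below makes explicit using the Python standard library instead of curl. The function name build_response_request is hypothetical; the base URL, paths, and bearer header are taken from the curl examples on this page. The function only constructs the request and does not send it.

```python
import urllib.request

BASE_URL = "https://api.zumik.ai"


def build_response_request(response_id, api_key, cancel=False):
    """Build (but do not send) a retrieve or cancel request for a response."""
    if not response_id.startswith("rsp_"):
        raise ValueError("response ids are prefixed rsp_")

    url = f"{BASE_URL}/v2/responses/{response_id}"
    if cancel:
        url += "/cancel"

    return urllib.request.Request(
        url,
        # Retrieve is a GET; cancel is a POST with no body.
        method="POST" if cancel else "GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

To send the request, pass the returned object to urllib.request.urlopen and decode the JSON body. Both calls return a response object of the shape documented above.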

Errors

| Status | Code | When |
| --- | --- | --- |
| 400 | invalid_request_error | input is empty, or session_id / branch_id does not exist in this project. |
| 401 | invalid_api_key | Missing or invalid API key. |
| 402 | credits_required | The prepaid credit balance is empty. |
| 403 | region_not_allowed | The resolved region is forbidden by this project's regional policy. |
| 404 | invalid_request_error | The response does not exist in this project. |
| 429 | quota_exceeded / rate_limit_exceeded | Budget reached, or per-key rate limit exceeded. |
| 504 | deadline_exceeded | The QoS deadline_ms elapsed before the provider responded. The request is not charged. |

See the full table on errors.
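One way a client might act on these error codes is sketched below. The error codes come from the table above, but the action labels and the choice of which errors are worth retrying are assumptions made for illustration, not documented Zumik behavior. The layout of the error body is also not specified on this page, so the function takes the code string directly.

```python
# Hypothetical mapping from documented error codes to a suggested client action.
NEXT_ACTION = {
    "invalid_request_error": "fix_request",       # 400 / 404: bad input or unknown id
    "invalid_api_key": "fix_credentials",         # 401
    "credits_required": "add_credits",            # 402: prepaid balance is empty
    "region_not_allowed": "review_region_policy", # 403
    "quota_exceeded": "raise_budget",             # 429: budget reached
    "rate_limit_exceeded": "retry_with_backoff",  # 429: per-key rate limit
    "deadline_exceeded": "retry_or_extend_deadline",  # 504: not charged
}


def next_action(error_code):
    """Suggest a follow-up for an error code; unknown codes need a human look."""
    return NEXT_ACTION.get(error_code, "inspect_manually")
```

Keying on the code rather than the HTTP status matters for 429, where quota_exceeded and rate_limit_exceeded share a status but call for different responses.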
