Agent runtime

A multi-turn agent loop on Zumik's native, stateful /v2 surface. The branch is the agent's append-only transcript; the session carries any pinned base context. Each turn appends events with optimistic concurrency and runs a code.fast response under an interactive QoS request.

Source: examples/agent-runtime. It is fully native: one surface, no OpenAI client.

Run

export ZUMIK_API_KEY="zk_..."
pip install -r requirements.txt   # httpx==0.28.1
python runtime.py

The loop

Open a session and use its default branch

The session returns a default_branch_id. The runtime tracks the branch's optimistic-concurrency position in a small cursor (version and head event id).

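r = client.post("/v2/sessions", headers=h, json={})
r.raise_for_status()  # surface HTTP errors before reading the body
s = r.json()
# head_event_id is not passed here, so it keeps the cursor's default until the first append
cur = BranchCursor(session_id=s["id"], branch_id=s["default_branch_id"], version=0)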
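
The snippets on this page use BranchCursor without showing its definition. A minimal sketch of what such a cursor could look like follows; the field names match the usage above, while the default of None for head_event_id, the advance helper, and the placeholder ids are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BranchCursor:
    """Client-side record of where the branch's head was last seen."""

    session_id: str
    branch_id: str
    version: int = 0
    head_event_id: Optional[str] = None  # assumed default: no event appended yet

    def advance(self, event: dict) -> None:
        # Hypothetical helper: the appended event's sequence becomes the new
        # version and its id becomes the new head, as in the append step below.
        self.version = event["sequence"]
        self.head_event_id = event["id"]


# Placeholder ids, not real Zumik identifiers.
cur = BranchCursor(session_id="ses_example", branch_id="br_example")
cur.advance({"id": "evt_1", "sequence": 1})
```

Keeping both values in one object means every append can send them together and update them together.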

Append the user turn

Each turn first records a user_message event. The append carries expected_version and expected_head_event_id, so a stale writer is rejected rather than silently rewriting history.

r = client.post(
    f"/v2/sessions/{cur.session_id}/branches/{cur.branch_id}/events",
    headers=h,
    json={
        "expected_version": cur.version,
        "expected_head_event_id": cur.head_event_id,
        "event": {"event_type": "user_message"},
    },
)
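if r.status_code == 409:
    # stale cursor: another writer advanced the branch first
    raise RuntimeError(f"branch version conflict: {r.json()}")
r.raise_for_status()  # any other HTTP error
evt = r.json()
# the appended event's sequence becomes the next expected_version
cur.version, cur.head_event_id = evt["sequence"], evt["id"]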

Run the response against the session and branch

Generation goes through /v2/responses with an interactive QoS request: a low TTFT target, a hard deadline, and a compatible fallback so a saturated primary degrades instead of failing the turn.

INTERACTIVE_QOS = {
    "class": "interactive",
    "target_ttft_ms": 500,
    "deadline_ms": 8000,
    "degrade_policy": "allow_compatible_fallback",
}
r = client.post("/v2/responses", headers=h, json={
    "model": "code.fast",
    "input": text,
    "session_id": cur.session_id,
    "branch_id": cur.branch_id,
    "qos": INTERACTIVE_QOS,
})
r.raise_for_status()  # fail on an HTTP error instead of a KeyError below
reply = r.json()["output_text"]
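
One practical consequence of a hard deadline is that the HTTP client should be willing to wait at least that long. The sketch below derives a client-side timeout from deadline_ms; the helper name and the 2000 ms margin are assumptions, not part of the Zumik API.

```python
INTERACTIVE_QOS = {
    "class": "interactive",
    "target_ttft_ms": 500,
    "deadline_ms": 8000,
    "degrade_policy": "allow_compatible_fallback",
}


def client_timeout_seconds(qos: dict, margin_ms: int = 2000) -> float:
    """Return a timeout that outlasts the QoS deadline by a safety margin.

    Hypothetical helper: the margin is an arbitrary allowance for network
    latency on top of the server-side deadline.
    """
    return (qos["deadline_ms"] + margin_ms) / 1000.0


timeout_s = client_timeout_seconds(INTERACTIVE_QOS)
```

With httpx, the value could be passed as httpx.Client(timeout=timeout_s). The httpx default timeout is 5 seconds, which is shorter than the 8000 ms deadline used here.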

Append the assistant reply

Record an assistant_message event the same way, advancing the cursor again. The branch is now the full transcript of the turn.
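
Since both appends share the same request shape, they can be folded into one helper. The following is a sketch, not the code in examples/agent-runtime: append_event is a hypothetical name, the event body carries only event_type as in the user-turn snippet, and the stub classes exist only so the example runs without a network.

```python
from types import SimpleNamespace
from typing import Any


def append_event(client: Any, headers: dict, cur: Any, event_type: str) -> dict:
    """Append one event and advance the cursor; raise on a version conflict."""
    r = client.post(
        f"/v2/sessions/{cur.session_id}/branches/{cur.branch_id}/events",
        headers=headers,
        json={
            "expected_version": cur.version,
            "expected_head_event_id": cur.head_event_id,
            "event": {"event_type": event_type},
        },
    )
    if r.status_code == 409:
        raise RuntimeError(f"branch version conflict: {r.json()}")
    r.raise_for_status()
    evt = r.json()
    cur.version, cur.head_event_id = evt["sequence"], evt["id"]
    return evt


class _StubResponse:
    """Offline stand-in for an HTTP response."""

    def __init__(self, status_code: int, body: dict):
        self.status_code = status_code
        self._body = body

    def json(self) -> dict:
        return self._body

    def raise_for_status(self) -> None:
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")


class _StubClient:
    """Offline stand-in that accepts every append and numbers events from 1."""

    def __init__(self):
        self.bodies = []

    def post(self, url, headers=None, json=None):
        self.bodies.append(json)
        n = len(self.bodies)
        return _StubResponse(200, {"id": f"evt_{n}", "sequence": n})


stub = _StubClient()
cur = SimpleNamespace(
    session_id="ses_example", branch_id="br_example", version=0, head_event_id=None
)
append_event(stub, {}, cur, "user_message")       # start of the turn
append_event(stub, {}, cur, "assistant_message")  # end of the turn
```

After the second call the cursor points at the assistant event, so the next turn's user_message is appended against that position.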

Optimistic concurrency

Every append advances the branch's version and head_event_id. Each request sends the expected_version and expected_head_event_id the writer last saw, so two writers racing on the same branch cannot clobber each other: the stale one gets a branch_version_conflict (HTTP 409) and can re-read and retry. This is what makes the append-only transcript safe under concurrency. See also the sessions and branching docs and the QoS docs.
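
The race can be illustrated without a server. The sketch below is a toy in-memory model of the check, not Zumik's implementation: InMemoryBranch and VersionConflict are hypothetical names, and treating the version as the number of events is an assumption.

```python
class VersionConflict(Exception):
    """Stands in for the branch_version_conflict (HTTP 409) response."""


class InMemoryBranch:
    """Toy append-only branch that enforces the expected-position check."""

    def __init__(self):
        self.events = []

    @property
    def version(self) -> int:
        return len(self.events)

    @property
    def head_event_id(self):
        return self.events[-1]["id"] if self.events else None

    def append(self, expected_version, expected_head_event_id, event_type):
        seen = (expected_version, expected_head_event_id)
        if seen != (self.version, self.head_event_id):
            raise VersionConflict(
                f"expected v{expected_version}, branch is at v{self.version}"
            )
        seq = self.version + 1
        event = {"id": f"evt_{seq}", "sequence": seq, "event_type": event_type}
        self.events.append(event)
        return event


branch = InMemoryBranch()

# Both writers read the same starting position.
seen_a = (branch.version, branch.head_event_id)
seen_b = (branch.version, branch.head_event_id)

branch.append(*seen_a, "user_message")  # writer A wins the race

try:
    branch.append(*seen_b, "note")  # writer B is stale and is rejected
    conflict = False
except VersionConflict:
    conflict = True
    # Re-read the head, then retry against the fresh position.
    seen_b = (branch.version, branch.head_event_id)
    branch.append(*seen_b, "note")
```

Writer B's event lands after writer A's rather than replacing it, which is the property the expected_version check provides.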

Notes

  • Event types are the native enum: user_message, assistant_message, tool_result, retrieval_result, checkpoint, note.
  • To attach a tool call's input or output to the transcript, store it as an artifact and pass the artifact id as the event's payload_ref.
  • To explore an alternative without disturbing the main line, fork a branch with POST /v2/sessions/{sid}/branches (fork_from_branch_id, optional fork_from_event_id).
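
The notes above translate into small request bodies. This sketch only builds the JSON. The field names come from the notes, while the helper names, the placeholder ids, and the assumption that payload_ref sits next to event_type inside the event object are illustrative.

```python
from typing import Optional

# The native event-type enum listed in the notes.
EVENT_TYPES = frozenset({
    "user_message", "assistant_message", "tool_result",
    "retrieval_result", "checkpoint", "note",
})


def event_with_payload_ref(event_type: str, artifact_id: str) -> dict:
    """Build an event that points at a stored artifact (hypothetical helper)."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type!r}")
    return {"event_type": event_type, "payload_ref": artifact_id}


def fork_request(fork_from_branch_id: str,
                 fork_from_event_id: Optional[str] = None) -> dict:
    """Build the body for POST /v2/sessions/{sid}/branches (hypothetical helper)."""
    body = {"fork_from_branch_id": fork_from_branch_id}
    if fork_from_event_id is not None:
        # Optional: fork from a specific event instead of the branch head.
        body["fork_from_event_id"] = fork_from_event_id
    return body


tool_event = event_with_payload_ref("tool_result", "art_example")
fork_body = fork_request("br_example", fork_from_event_id="evt_3")
```

Validating the event type on the client keeps a typo from reaching the API as a rejected append.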
