Execution profiles
Managed-provider default, BYOK, BYOC, hybrid, and OpenRouter emergency fallback - what each is for and who owns the control plane inside it.
An execution profile decides where a request actually runs. Zumik is provider-first: the default profile uses Zumik's contracted managed providers, and the other profiles are escalations you add only when evidence justifies them. Every profile runs over the same internal execution system, so your handles, aliases, and state objects are identical regardless of which profile serves a given request.
The profiles
Managed provider (default)
Zumik's own provider accounts. Fastest onboarding, broad model coverage, provider-native caching, automatic failover.
BYOK
Your provider credentials. You keep the billing relationship and your account-level retention rules; Zumik still drives routing, state, and telemetry.
BYOC
Your cloud, your runtime. For dedicated SLOs, private networking, stronger purge evidence, and runtime-confirmed reuse - activated only when replay proves it pays off.
Hybrid
Managed providers for breadth and overflow, BYOC hot lanes for concentrated, reusable workloads. The common shape for a maturing coding-agent platform.
Managed provider: the default
This is where everyone starts.
Client → Tier 1 gateway → Product API Core → Execution Broker
→ Managed provider adapters → Company-managed provider accounts
It gives you broad coverage, provider-native prompt caching (OpenAI automatic, Anthropic explicit cache-control, Gemini implicit, xAI context caching), provider Batch API lanes for cost reduction, multi-provider policy, and automatic failover - with no infrastructure to operate.
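As a rough illustration of the automatic failover mentioned above, the sketch below tries provider adapters in policy order and moves to the next one when a call fails. `Adapter`, `ProviderError`, and `run_with_failover` are hypothetical names invented for this example; they are not Zumik's actual adapter interface.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a (hypothetical) adapter when its provider cannot serve a request."""


@dataclass(frozen=True)
class Adapter:
    name: str                    # illustrative label for a managed provider
    call: Callable[[str], str]   # assumed shape: prompt in, completion out


def run_with_failover(prompt: str, adapters: Sequence[Adapter]) -> tuple[str, str]:
    """Try adapters in policy order and return (adapter_name, completion)."""
    failures = []
    for adapter in adapters:
        try:
            return adapter.name, adapter.call(prompt)
        except ProviderError as exc:
            # Remember why this provider was skipped, then try the next one.
            failures.append(f"{adapter.name}: {exc}")
    raise ProviderError("all managed providers failed: " + "; ".join(failures))
```

The list order stands in for the multi-provider policy. The sketch fails over only between managed providers and never reaches for a brokered fallback on its own.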
BYOK: your keys
Client → Product API Core → Execution Broker → Provider adapter → Your provider credential
BYOK is for customers with existing provider agreements, their own rate limits and quota reservations, procurement constraints, or account-level retention policies they need to keep. Provider-native cost and speed optimizations stay active under your key. The subscription credential flow (Claude Code, ChatGPT Codex) is part of this family.
BYOC: your cloud
Global product control plane → Customer-cloud data plane
→ One selected inference scheduler → Self-hosted runtime lane → KV hierarchy
BYOC is the heaviest profile and the last resort, not the first. It is justified by dedicated latency SLOs, sustained hot-model volume, private networking, regional isolation, custom models, explicit KV-cache orchestration, or stronger purge evidence - and only where replay data shows a material benefit over the managed bill at the same reliability.
A high Workload Reuse Score does not justify BYOC on its own. The most common result of replay analysis is that provider-native caching already captures the reuse, and BYOC has no business case. Length alone never recommends self-hosting.
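The BYOC rule above can be read as a two-part test: at least one named driver must apply, and replay must show a material benefit at the same reliability. The sketch below is one possible encoding under stated assumptions: it models "material benefit" as a relative cost saving, the `min_saving` threshold of 20% is an arbitrary placeholder, and every name (`ReplayResult`, `byoc_justified`, the driver labels) is hypothetical.

```python
from dataclasses import dataclass

# Drivers named in the text above; the string labels are illustrative only.
BYOC_DRIVERS = frozenset({
    "dedicated_latency_slo",
    "sustained_hot_model_volume",
    "private_networking",
    "regional_isolation",
    "custom_models",
    "explicit_kv_cache_orchestration",
    "stronger_purge_evidence",
})


@dataclass(frozen=True)
class ReplayResult:
    managed_cost: float          # replayed bill on managed providers
    byoc_cost: float             # replayed bill on the candidate BYOC lane
    managed_reliability: float   # e.g. success rate in [0, 1]
    byoc_reliability: float
    workload_reuse_score: float  # recorded, but deliberately unused below


def byoc_justified(drivers: set[str], replay: ReplayResult,
                   min_saving: float = 0.2) -> bool:
    """True only with a named driver AND a material replayed saving."""
    if replay.managed_cost <= 0:
        return False  # no managed bill to improve on
    has_driver = bool(BYOC_DRIVERS & drivers)
    reliable = replay.byoc_reliability >= replay.managed_reliability
    saving = (replay.managed_cost - replay.byoc_cost) / replay.managed_cost
    return has_driver and reliable and saving >= min_saving
```

The reuse score never enters the decision. A workload with a high score but a small replayed saving is rejected, which matches the common case where provider-native caching has already captured the reuse.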
OpenRouter: emergency fallback only
OpenRouter is a last-resort continuity layer, used only when a primary provider has a verified total outage for a required model path.
OpenRouter is never used for routine routing or price arbitrage. It is gated behind explicit policy checks for retention, region, procurement, and customer allowlists, and it is disabled entirely for customers whose data-boundary or compliance requirements are incompatible with brokered execution. Every use is recorded with the failed upstream path and the policy that allowed the exception. A clear degraded response or rejection is always preferred over silently violating policy.
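The gating described above might look like the following sketch: the fallback is refused unless the outage is verified, brokered execution is permitted for the customer, and every policy check passes. A permitted use carries an audit record naming the failed upstream path and the allowing policy. All class, field, and function names here are assumptions made for illustration, not the platform's real schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FallbackPolicy:
    brokered_execution_allowed: bool  # False disables the fallback entirely
    retention_ok: bool
    region_ok: bool
    procurement_ok: bool
    on_customer_allowlist: bool
    policy_id: str = "policy-unknown"  # hypothetical identifier for the audit trail


@dataclass
class FallbackDecision:
    allowed: bool
    reason: str
    audit: dict = field(default_factory=dict)


def decide_fallback(policy: FallbackPolicy, failed_path: str,
                    outage_verified: bool) -> FallbackDecision:
    """Reject by default; allow only a verified outage that passes every check."""
    if not outage_verified:
        return FallbackDecision(False, "no verified total outage")
    if not policy.brokered_execution_allowed:
        return FallbackDecision(False, "fallback disabled for this customer")
    checks = {
        "retention": policy.retention_ok,
        "region": policy.region_ok,
        "procurement": policy.procurement_ok,
        "customer allowlist": policy.on_customer_allowlist,
    }
    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        return FallbackDecision(False, "policy check failed: " + ", ".join(failed))
    # Record the failed upstream path and the policy that allowed the exception.
    audit = {"failed_upstream_path": failed_path, "allowing_policy": policy.policy_id}
    return FallbackDecision(True, "emergency fallback", audit)
```

Every rejection returns an explicit reason, so the caller can surface a clear degraded response or rejection instead of routing anyway.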
Control-plane ownership
Each profile assigns one owner per responsibility, so nothing is double-owned.
| Responsibility | Managed | BYOK | BYOC |
|---|---|---|---|
| Auth, quotas, project policy | Product API Core | Product API Core | Product API Core |
| Provider / profile selection | Execution Broker | Execution Broker | Execution Broker |
| Provider credential | Company | Customer | n/a |
| Model selection | Alias Resolver | Alias Resolver | Alias Resolver |
| Cache routing | Provider | Provider | Selected scheduler + runtime |
| Session state | State Service | State Service | State Service |
| Billing | Platform | Customer-provider + platform fee | Platform fee + customer infra |
| Purge guarantees | Bounded by provider capability | Bounded by provider capability | BYOC operator + State Service |
Inside one BYOC profile, exactly one component owns replica selection. The platform does not run competing replica schedulers in the same path. You pick the scheduler-owned profile up front; you do not stack two.
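To make the ownership table easy to query, it can be held as plain data. The sketch below copies the table's rows into a dictionary, adds a lookup helper, and adds a guard for the single-scheduler rule. The helper names (`owner_of`, `select_replica_scheduler`) and the scheduler labels used in examples are hypothetical.

```python
PROFILES = ("Managed", "BYOK", "BYOC")

# Rows copied from the table above: responsibility -> owner per profile.
OWNERS = {
    "Auth, quotas, project policy": ("Product API Core",) * 3,
    "Provider / profile selection": ("Execution Broker",) * 3,
    "Provider credential": ("Company", "Customer", "n/a"),
    "Model selection": ("Alias Resolver",) * 3,
    "Cache routing": ("Provider", "Provider", "Selected scheduler + runtime"),
    "Session state": ("State Service",) * 3,
    "Billing": ("Platform", "Customer-provider + platform fee",
                "Platform fee + customer infra"),
    "Purge guarantees": ("Bounded by provider capability",
                         "Bounded by provider capability",
                         "BYOC operator + State Service"),
}


def owner_of(responsibility: str, profile: str) -> str:
    """Look up the owner of a responsibility under a given profile."""
    return OWNERS[responsibility][PROFILES.index(profile)]


def select_replica_scheduler(configured: list[str]) -> str:
    """Enforce the rule that exactly one component owns replica selection."""
    if len(configured) != 1:
        raise ValueError(
            f"a BYOC profile needs exactly one replica scheduler, got {len(configured)}"
        )
    return configured[0]
```

For example, `owner_of("Cache routing", "BYOC")` returns `"Selected scheduler + runtime"`, and passing two scheduler names to `select_replica_scheduler` raises `ValueError` rather than stacking them.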
Quality of service
QoS classes, the request you submit, and the formal outcome object that makes the platform accountable for whether latency and reliability targets were actually met.
Capability manifests
Versioned, per-provider records of exactly what each provider and profile supports - the source of truth that routing decisions and purge claims both read.