Product · Inference gateway
BTL Runtime
One API in front of every model provider. Lower effective AI spend and lower latency — without rewriting your app.
For teams shipping across OpenAI, Anthropic, Bedrock, Vertex, OpenRouter, and the long tail. BTL Runtime is the drop-in gateway that keeps working when provider economics change underneath you.
import os

from openai import OpenAI

# Assumes your BTL key is exported as the BTL_KEY environment variable.
client = OpenAI(
    base_url="https://api.badtheorylabs.com/v1",
    api_key=os.environ["BTL_KEY"],
)

# same call. same shape. less spend.
client.chat.completions.create(
    model="btl-frontier",
    messages=[{"role": "user", "content": "ship it"}],
)
How it cuts spend
A token-efficiency layer,
not just a router.
Routing to a cheaper equivalent upstream is only half of it. The runtime also sends fewer billable tokens, reuses the ones it must send, and avoids doing the same work twice. You keep the model boundary you chose; we cut the waste before the request reaches it.
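To make "avoids doing the same work twice" concrete, here is a minimal sketch of request-level deduplication. It is an illustration only and not a description of how BTL Runtime is implemented: `DedupCache`, `request_key`, and `fake_upstream` are hypothetical names, and the cache is a plain in-memory dict with no expiry.

```python
import hashlib
import json
from typing import Callable, Dict, List

Message = Dict[str, str]


def request_key(model: str, messages: List[Message]) -> str:
    """Stable hash of the request payload; identical requests share a key."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class DedupCache:
    """Hypothetical in-memory cache: one upstream call per unique request."""

    def __init__(self, upstream: Callable[[str, List[Message]], str]) -> None:
        self._upstream = upstream
        self._store: Dict[str, str] = {}
        self.upstream_calls = 0  # counts billable calls actually made

    def complete(self, model: str, messages: List[Message]) -> str:
        key = request_key(model, messages)
        if key not in self._store:
            # First time this exact request is seen: pay for it once.
            self.upstream_calls += 1
            self._store[key] = self._upstream(model, messages)
        return self._store[key]


def fake_upstream(model: str, messages: List[Message]) -> str:
    """Stand-in for a billable provider call (assumption, not a real API)."""
    return f"{model}: {messages[-1]['content']}"


cache = DedupCache(fake_upstream)
msgs = [{"role": "user", "content": "ship it"}]
first = cache.complete("btl-frontier", msgs)
second = cache.complete("btl-frontier", msgs)  # served from the cache
```

In this sketch the second call returns the stored answer, so only one upstream call is counted. A production layer would also need expiry, size limits, and a policy for requests whose output is expected to vary between calls.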
What teams get
Switch the base URL,
keep the product.
Best fit for teams already shipping AI products and feeling real spend or latency pressure. No lock-in to a single vendor: ask for a specific provider when you need it, and let the gateway choose when you don't.
Request access →
Customer API surface
The routes that
actually matter.
Most traffic only ever touches two of these. The rest are for keys, usage, and the catalog. No /v1/admin/* or ops-only auth in the customer path.
Stop paying for token waste.
Tell us your stack, providers, traffic, and constraints. We'll get you a key.