What is an LLM gateway?

An LLM gateway is a single API endpoint that sits in front of many model providers and routes your request to whichever model you ask for. Instead of holding a separate key, base URL, and SDK quirk for each lab, you hold one key and one OpenAI-compatible endpoint, and the gateway handles the rest. It is the same idea as an API gateway in classic backend work, applied to language models. Here is what that buys you.

The plain definition

A gateway exposes one OpenAI-compatible interface, usually /v1/chat/completions, and maps the model field in your request to a real provider behind the scenes. You send the same request shape every time; the gateway picks the upstream, attaches the right provider credentials, translates any quirks, and streams the response back. Your code does not change when you switch models, because the contract you code against stays constant.

Why it helps

Three wins. One integration: write against one endpoint and you can use every model the gateway carries, with no per-provider client. One bill: usage across all providers lands on a single balance instead of a dozen separate invoices. And easy switching: changing a model is a one-line edit, so you can chase the best price or quality per task without re-plumbing your app. For most builders the time saved on integration alone is the whole reason.

How it works under the hood

When a request arrives, the gateway reads the model name, looks up the matching upstream provider, swaps in that provider's credentials, rewrites any provider-specific body fields, and forwards the call. The streamed tokens come back through the same connection, so from your side it looks like one normal OpenAI call. Good gateways add retries on transient errors, usage and cost accounting, and a current model catalog so new releases appear without you touching anything.

Who actually needs one

You want a gateway if you use more than one model, plan to switch models as prices and quality move, or build anything that should not be hard-wired to a single lab. Coding agents, chat apps, roleplay front ends, and internal tools all benefit. If you genuinely only ever call one model from one provider and never expect to change, a direct provider key is simpler. Everyone else saves real effort with a gateway.

In short

An LLM gateway turns many providers into one endpoint, one key, and one bill, so you integrate once and switch models freely. UnoRouter is an OpenAI-compatible gateway in exactly this mold: one key reaches 200+ models for code and chat alike, with pay-as-you-go credits that do not expire. If you touch more than one model, a gateway is the cleaner foundation.

Try a gateway yourself: create a free account or browse the models.