CrabTalk

Introduction

High-performance LLM API gateway in Rust — one API format, many providers, sub-millisecond overhead.

CrabLLM is a high-performance LLM API gateway written in Rust. It sits between your application and LLM providers, exposing an OpenAI-compatible API surface.


What it does

You send requests in OpenAI format to CrabLLM. It routes them to the configured provider — OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, or Ollama — translating the request and response as needed.
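For example, a request body in the standard OpenAI chat-completions shape works unchanged whichever provider is configured behind the gateway (the model name here is illustrative, not a fixed alias):

```json
{
  "model": "claude-sonnet",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": true
}
```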

Your application talks to one endpoint. CrabLLM handles the rest:

  • Provider translation — Anthropic, Google, and Bedrock have their own API formats. CrabLLM translates requests and responses automatically.
  • Routing — Weighted random selection across multiple providers for the same model. Automatic fallback when a provider fails.
  • Streaming — SSE streaming proxied without buffering.
  • Auth — Virtual API keys with per-key model access control.
  • Extensions — Rate limiting, caching, cost tracking, budget enforcement.
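The routing behavior above — weighted random selection with fallback on failure — can be sketched roughly like this. This is a simplified illustration, not CrabLLM's actual implementation; all names are hypothetical:

```python
import random

def pick_provider(providers, weights, rng=random):
    """Weighted random choice among providers serving the same model."""
    return rng.choices(providers, weights=weights, k=1)[0]

def route_with_fallback(providers, weights, send):
    """Try the weighted pick first; fall back to the remaining providers."""
    ordered = [pick_provider(providers, weights)]
    ordered += [p for p in providers if p not in ordered]
    for provider in ordered:
        try:
            return send(provider)
        except ConnectionError:
            continue  # provider failed; try the next one
    raise RuntimeError("all providers failed")

# Example: the "openai" provider is down, so requests fall back.
def send(provider):
    if provider == "openai":
        raise ConnectionError("upstream 503")
    return f"handled by {provider}"

result = route_with_fallback(["openai", "anthropic"], [9, 1], send)
```

Whichever provider the weighted pick selects, the request ends up handled by the surviving provider, which is the point of the fallback pass.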

Why Rust

  • Sub-millisecond overhead — no GC pauses, no interpreter startup.
  • Memory safety — without runtime cost.
  • Concurrency — Tokio async runtime handles thousands of concurrent streaming connections efficiently.
  • Deployment — single static binary. No interpreter, no virtualenv, no Docker required.

Feature comparison

Feature                    LiteLLM     CrabLLM
/chat/completions          yes         yes
/embeddings                yes         yes
/models                    yes         yes
OpenAI provider            yes         yes
Anthropic provider         yes         yes
Google Gemini provider     yes         yes
Azure OpenAI provider      yes         yes
AWS Bedrock provider       yes         yes
Tool/function calling      yes         yes
SSE streaming              yes         yes
Virtual keys + auth        yes         yes
Weighted routing           yes         yes
Model aliasing             yes         yes
Retry + fallback           yes         yes
Rate limiting (RPM/TPM)    yes         yes
Cost/usage tracking        yes         yes
Budget enforcement         yes         yes
Request caching            yes         yes
Image/audio endpoints      yes         yes
Storage (memory)           yes         yes
Storage (persistent)       Postgres    SQLite
Redis storage              yes         yes
