Introduction
High-performance LLM API gateway in Rust — one API format, many providers, sub-millisecond overhead.
CrabLLM is a high-performance LLM API gateway written in Rust. It sits between your application and LLM providers, exposing an OpenAI-compatible API surface.
One API format. Many providers. Low overhead.
What it does
You send requests in OpenAI format to CrabLLM. It routes them to the configured provider — OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, or Ollama — translating the request and response as needed.
Your application talks to one endpoint. CrabLLM handles the rest:
- Provider translation — Anthropic, Google, and Bedrock have their own API formats. CrabLLM translates automatically.
- Routing — Weighted random selection across multiple providers for the same model. Automatic fallback when a provider fails.
- Streaming — SSE streaming proxied without buffering.
- Auth — Virtual API keys with per-key model access control.
- Extensions — Rate limiting, caching, cost tracking, budget enforcement.
Why Rust
- Sub-millisecond overhead — no GC pauses, no interpreter startup.
- Memory safety — without runtime cost.
- Concurrency — Tokio async runtime handles thousands of concurrent streaming connections efficiently.
- Deployment — single static binary. No interpreter, no virtualenv, no Docker required.
Feature comparison
| Feature | LiteLLM | CrabLLM |
|---|---|---|
/chat/completions | yes | yes |
/embeddings | yes | yes |
/models | yes | yes |
| OpenAI provider | yes | yes |
| Anthropic provider | yes | yes |
| Google Gemini provider | yes | yes |
| Azure OpenAI provider | yes | yes |
| AWS Bedrock provider | yes | yes |
| Tool/function calling | yes | yes |
| SSE streaming | yes | yes |
| Virtual keys + auth | yes | yes |
| Weighted routing | yes | yes |
| Model aliasing | yes | yes |
| Retry + fallback | yes | yes |
| Rate limiting (RPM/TPM) | yes | yes |
| Cost/usage tracking | yes | yes |
| Budget enforcement | yes | yes |
| Request caching | yes | yes |
| Image/audio endpoints | yes | yes |
| Storage (memory) | yes | yes |
| Storage (persistent) | Postgres | SQLite |
| Redis storage | yes | yes |