OpenAI-Compatible API
| Endpoint | Method | Description |
|---|---|---|
/v1/chat/completions | POST | Chat completions (streaming supported) |
/v1/responses | POST | Create an OpenAI Responses API response |
/v1/responses/{id} | GET | Retrieve a stored response |
/v1/responses/{id} | DELETE | Delete a stored response (forwards native deletion where supported) |
/v1/responses/{id}/cancel | POST | Cancel an in-progress response (provider-native where supported) |
/v1/responses/{id}/input_items | GET | List the input items of a stored response |
/v1/responses/input_tokens | POST | Count input tokens for a Responses request |
/v1/responses/compact | POST | Compact a Responses conversation (provider-native where supported) |
/v1/conversations | POST | Create a conversation (gateway-managed) |
/v1/conversations/{id} | GET | Retrieve a conversation |
/v1/conversations/{id} | POST | Replace conversation metadata in full |
/v1/conversations/{id} | DELETE | Delete a conversation |
/v1/embeddings | POST | Text embeddings |
/v1/models | GET | List available models |
/v1/audio/speech | POST | Text-to-speech, returning binary audio |
/v1/audio/transcriptions | POST | Speech-to-text from a multipart upload |
/v1/realtime | GET | Realtime speech-to-speech websocket upgrade (when REALTIME_ENABLED) |
/v1/files | POST | Upload a file (OpenAI-compatible multipart) |
/v1/files | GET | List files |
/v1/files/{id} | GET | Retrieve file metadata |
/v1/files/{id} | DELETE | Delete a file |
/v1/files/{id}/content | GET | Retrieve raw file content |
/v1/batches | POST | Create a native provider batch (OpenAI-compatible schema; inline requests supported where provider-native) |
/v1/batches | GET | List stored batches |
/v1/batches/{id} | GET | Retrieve one stored batch |
/v1/batches/{id}/cancel | POST | Cancel a pending batch |
/v1/batches/{id}/results | GET | Retrieve native batch results when available |
Anthropic-Compatible API
| Endpoint | Method | Description |
|---|---|---|
/v1/messages | POST | Anthropic Messages API through translated model routing (streaming supported) |
/v1/messages/count_tokens | POST | Heuristic Anthropic Messages input token estimate |
Provider Passthrough
| Endpoint | Method | Description |
|---|---|---|
/p/{provider}/... | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS | Provider-native passthrough with opaque upstream responses |
Admin Endpoints
Admin REST and dashboard routes (/admin/*) are covered in
Admin Endpoints.
Operations Endpoints
| Endpoint | Method | Description |
|---|---|---|
/health | GET | Liveness check (always 200 while the process serves) |
/health/ready | GET | Readiness check: pings storage (503 if down) and Redis cache (degraded, still 200) |
/metrics | GET | Prometheus metrics (experimental, when enabled) |
/swagger/index.html | GET | Swagger UI (when enabled) |