Must Have
- Intelligent routing
- Broader provider support: Cohere, Command A, and Operational
- Budget management with limits per
user_pathand/or API key - Editable model pricing for accurate cost tracking and budgeting
- Full support for the OpenAI
/responseslifecycle - Anthropic-compatible
/messagesingress and/messages/count_tokens - Full support for the OpenAI
/conversationslifecycle - Prompt cache visibility showing how much of each prompt was cached by the provider
- Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
- Passthrough for all providers, beyond the current OpenAI and Anthropic beta
- Fix failover charts in the dashboard
Should Have
- Cluster mode