Documentation Index
Fetch the complete documentation index at: https://gomodel.enterpilot.io/docs/llms.txt
Use this file to discover all available pages before exploring further.
Overview
GoModel accepts the Anthropic Messages API request dialect atPOST /v1/messages,
in addition to its OpenAI-compatible API. Clients and SDKs that speak the Anthropic
format can point at GoModel unchanged.
The request is translated to GoModel’s canonical chat type at ingress and runs through
the same pipeline as /v1/chat/completions — so model aliases, workflow policy,
budgets, failover, the response cache, usage/cost tracking, and audit logging all
apply. Because every provider implements chat completion, an Anthropic-format request
can be routed to any configured provider (OpenAI, Gemini, Bedrock, and others),
not only Anthropic.
This differs from the passthrough API: /p/anthropic/v1/messages
forwards bytes verbatim to the Anthropic upstream only, while the managed /v1/messages
endpoint routes anywhere and is fully managed.
Supported endpoints
| Endpoint | Behavior |
|---|---|
POST /v1/messages | Creates a message through translated model routing. Supports streaming (stream: true) with Anthropic-format SSE events. |
POST /v1/messages/count_tokens | Returns a heuristic input token estimate. |
Example
type: "message", content blocks,
stop_reason, usage). Errors use the Anthropic error envelope
({"type": "error", "error": {...}}). max_tokens is required, as in the Anthropic API.
Streaming responses emit the Anthropic SSE event sequence (message_start,
content_block_start/content_block_delta/content_block_stop, message_delta,
message_stop).
Cost tracking and audit logs
/v1/messages requests are tracked and audited exactly like the OpenAI-compatible
routes. Cost is computed from the actual provider that served the request, and usage
is recorded under the /v1/messages endpoint so it can be filtered in the dashboard.
Limitations
/v1/messages translates through GoModel’s canonical chat type. Anthropic-specific
features that have no canonical equivalent are not preserved end to end:
cache_controlbreakpoints are dropped — prompt-caching cost benefits are not carried through the canonical hop.- Extended-thinking signatures and
thinkingblocks on input messages are dropped. - Server/built-in tools (web search, code execution, …) are rejected with a clear
400; only custom tools (typeabsent or"custom") translate. top_kis dropped — it has no portable OpenAI-compatible equivalent, and OpenAI-family providers reject unknown request fields.temperatureandtop_pare forwarded.documentand other non-text/image content blocks are rejected with a clear400error rather than silently dropped.stop_sequencesare honored, but a stop-sequence-triggered completion reportsstop_reason: "end_turn"instead of"stop_sequence"(the output is still truncated correctly).count_tokensreturns a provider-agnostic heuristic estimate (≈ characters / 4), not a tokenizer-exact count. Use it for budgeting and UX sizing, not hard context-limit decisions.
/p/anthropic/v1/messages passthrough route instead.
See ADR-0007
for the design rationale and tradeoffs.