Alibaba Cloud Model Studio

Alibaba Cloud Model Studio is Alibaba Cloud’s model-as-a-service platform for the Qwen family of models. The service has also been called Bailian (百炼) and DashScope in consoles, SDKs, and API documentation. GoModel routes to it through the OpenAI-compatible endpoint (`/compatible-mode/v1`).

GoModel keeps the provider ID, environment variables, and passthrough path as `bailian`, `BAILIAN_*`, and `/p/bailian/...` for compatibility.

Because Bailian deprecated `max_tokens` in April 2026 in favor of `max_completion_tokens`, GoModel automatically maps the standard `max_tokens` field to `max_completion_tokens` on every request to the `bailian` provider, so no client change is required.

Configure

BAILIAN_API_KEY=...

Or in config.yaml:

providers:
  bailian:
    type: bailian
    api_key: "${BAILIAN_API_KEY}"
    # base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"

Base URLs

Model Studio’s OpenAI-compatible API is available in multiple regions. Set `BAILIAN_BASE_URL` to switch regions; `{workspace-id}` in the URLs below is a placeholder for your own workspace ID:

Region	URL
Beijing (default)	`https://dashscope.aliyuncs.com/compatible-mode/v1`
Singapore	`https://{workspace-id}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1`
Frankfurt	`https://{workspace-id}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1`
Hong Kong	`https://{workspace-id}.cn-hongkong.maas.aliyuncs.com/compatible-mode/v1`
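
The region table can also be expressed as a small lookup, which may help when the base URL is generated by a deployment script. This is only a sketch: the helper name `bailian_base_url`, the region keys, and the `ws-example` workspace ID are hypothetical, and the URL templates are copied from the table above.

```python
# Hypothetical helper: URL templates are copied verbatim from the region table.
BAILIAN_BASE_URLS = {
    "beijing": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "singapore": "https://{workspace-id}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1",
    "frankfurt": "https://{workspace-id}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1",
    "hongkong": "https://{workspace-id}.cn-hongkong.maas.aliyuncs.com/compatible-mode/v1",
}


def bailian_base_url(region="beijing", workspace_id=None):
    """Return the value to assign to BAILIAN_BASE_URL for a region."""
    template = BAILIAN_BASE_URLS[region]  # KeyError for an unknown region
    if "{workspace-id}" not in template:
        return template  # Beijing needs no workspace ID
    if not workspace_id:
        raise ValueError(f"region {region!r} requires a workspace ID")
    return template.replace("{workspace-id}", workspace_id)


if __name__ == "__main__":
    # "ws-example" is a made-up workspace ID used only for illustration.
    print("BAILIAN_BASE_URL=" + bailian_base_url("singapore", "ws-example"))
```

The printed line can be pasted into an environment file, or the returned string can be used as `base_url` in config.yaml.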

Model IDs

Common Qwen model identifiers are listed below. Check the Model Studio model list for the current catalog:

Model	Example ID
Qwen 3.7 Max	`qwen3.7-max`
Qwen 3.7 Plus	`qwen3.7-plus`
Qwen 3.6 Flash	`qwen3.6-flash`
Qwen 3 Max	`qwen3-max`
Qwen 3 Plus	`qwen3-plus`
Qwen 3 Flash	`qwen3-flash`
Qwen 3 Coder Plus	`qwen3-coder-plus`
Text Embedding	`text-embedding-v3`

`max_tokens` compatibility

Model Studio deprecated `max_tokens` on 2026-04-20 (effective 2026-05-30). Its compatible-mode models now require `max_completion_tokens` instead. GoModel transparently maps the standard `max_tokens` parameter to `max_completion_tokens` for every `bailian` provider request: send `max_tokens` as you normally would, and GoModel rewrites it before forwarding the request to Model Studio.

# max_tokens=4096 is automatically sent as max_completion_tokens=4096
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-max",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
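
To make the rewrite concrete, the field mapping can be modeled as a small function over the request body. This is an illustrative sketch, not GoModel's actual implementation: the function name `map_max_tokens` is hypothetical, and the choice to keep an explicit `max_completion_tokens` when both fields are present is an assumption.

```python
def map_max_tokens(body):
    """Return a copy of an OpenAI-style request body with max_tokens renamed."""
    mapped = dict(body)  # shallow copy so the caller's dict is left untouched
    if "max_tokens" in mapped:
        limit = mapped.pop("max_tokens")
        # Assumption: if the client already sent max_completion_tokens, keep it.
        mapped.setdefault("max_completion_tokens", limit)
    return mapped


if __name__ == "__main__":
    request = {
        "model": "qwen3-max",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 4096,
    }
    # The forwarded body carries max_completion_tokens and no max_tokens key.
    print(map_max_tokens(request))
```

Requests that never set `max_tokens` pass through this function unchanged.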

Supported features

Feature	Supported
Chat completions	✅
Streaming chat	✅
Responses (`/v1/responses`)	✅ (translated to chat)
Embeddings	✅ (configure model IDs via `BAILIAN_MODELS`)
Files (`/v1/files`)	✅
Batches (`/v1/batches`)	✅
Passthrough (`/p/bailian/...`)	✅

Embedding models (`text-embedding-v3`, `text-embedding-v4`) are served by the compatible-mode API but are not auto-discovered from the upstream `/v1/models` endpoint. Set `BAILIAN_MODELS=text-embedding-v3,text-embedding-v4` or use `CONFIGURED_PROVIDER_MODELS_MODE=allowlist` to make them available.
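
Once an embedding model is configured, a request through GoModel might look like the following sketch. It rests on several assumptions: GoModel listens on `http://localhost:8080` as in the chat example, embeddings are exposed at the OpenAI-style `/v1/embeddings` path, no auth header is needed, and the response uses the OpenAI `data[].embedding` layout. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: same host and port as the curl chat example on this page.
GOMODEL_URL = "http://localhost:8080"


def build_embeddings_request(texts, model="text-embedding-v3", base_url=GOMODEL_URL):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",  # assumed OpenAI-compatible path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, **kwargs):
    """Send the request and return one vector per input text."""
    with urllib.request.urlopen(build_embeddings_request(texts, **kwargs)) as resp:
        body = json.load(resp)
    # Assumption: OpenAI-style response shape, {"data": [{"embedding": [...]}]}.
    return [item["embedding"] for item in body["data"]]
```

With the gateway running and the model listed in `BAILIAN_MODELS`, `embed(["Hello"])` would return a list containing a single vector.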