Guardrails

Overview

Guardrails are a pipeline of rules that run before a request reaches any LLM provider. They can inspect, modify, or reject requests — giving you centralized control over every prompt that flows through GoModel. Guardrails work across all text-based endpoints:

/v1/chat/completions
/v1/responses
/v1/messages

Guardrails for images, TTS, STT, and video models are planned as a separate system and are not covered here.

Quick Start

Add a guardrails section to your config/config.yaml:

guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always respond safely and respectfully."

That’s it. Every request now gets the safety prompt prepended to its system instructions.

Manage from the Dashboard

Guardrail definitions can also be created and edited from the admin dashboard instead of config.yaml — useful for iterating on rules without a redeploy, or for operators who don’t manage this repo’s config directly.

GoModel dashboard Guardrails page with a Guardrail Library summary card and an empty Instances list with a Create Guardrail button

Open Guardrails in the sidebar and click Create Guardrail to add an instance: give it a name, pick a type (system_prompt or llm_based_altering), optionally scope it to a user_path, and fill in the type-specific settings described below. config.yaml entries are seeded into the same store at startup and stay in sync with it, so dashboard-created and config-declared guardrails appear side by side.

Runtime guardrail execution still depends on GUARDRAILS_ENABLED. With it off, the page still lets you manage definitions — they just don’t run on live traffic yet.

How It Works

Messages are extracted from the incoming request into a normalized format
The guardrails pipeline processes the messages (inject, modify, or reject)
Modified messages are applied back to the original request
The request continues to the LLM provider

Guardrails never see the raw API request types — they operate on a normalized message list. This means the same guardrail works identically for /chat/completions, /responses, and /messages.

Execution Order

Each guardrail has an order value that controls when it runs:

Same order → run in parallel (concurrently)
Different order → run sequentially (ascending)

Each sequential group receives the output of the previous group. If any guardrail returns an error, the request is rejected and never reaches the provider.

Configuration

Full Structure

guardrails:
  enabled: true    # Master switch (default: false)
  rules:
    - name: "rule-name"         # Unique identifier for this instance
      type: "system_prompt"     # Guardrail type
      user_path: "/team/privacy" # Optional base path for internal auxiliary calls
      order: 0                  # Execution order
      system_prompt:            # Type-specific settings
        mode: "decorator"
        content: "Your prompt text here."

Environment Variable

You can toggle guardrails without editing the config file:

export GUARDRAILS_ENABLED=true

Rule Fields

Field	Required	Description
`name`	Yes	Human-readable identifier. Supports spaces and unicode, but not `/`.
`type`	Yes	Guardrail type: `system_prompt` or `llm_based_altering`.
`user_path`	No	Optional base user path for internal auxiliary guardrail requests.
`order`	No	Execution order. Default `0`. Same value = parallel, different = sequential.

Guardrail Types

`system_prompt`

Adds, replaces, or decorates the system prompt on every request.

Settings

Field	Required	Description
`mode`	No	`inject`, `override`, or `decorator`. Default: `inject`.
`content`	Yes	The system prompt text to apply.

Modes

inject
override
decorator

Adds a system message only if none exists. Existing system prompts are left untouched.

- name: "default-system"
  type: "system_prompt"
  order: 0
  system_prompt:
    mode: "inject"
    content: "You are a helpful assistant."

Behavior:

Request has no system prompt → adds one
Request already has a system prompt → no change

Replaces all existing system messages with the configured content.

- name: "strict-system"
  type: "system_prompt"
  order: 0
  system_prompt:
    mode: "override"
    content: "You are a compliance-focused assistant. Follow all company policies."

Behavior:

Any existing system prompts are removed
A single new system prompt is set

Prepends the configured content to the existing system prompt (separated by a newline). If no system prompt exists, adds one.

- name: "safety-prefix"
  type: "system_prompt"
  order: 0
  system_prompt:
    mode: "decorator"
    content: "Always respond safely and respectfully."

Behavior:

Existing system prompt "You are a coding assistant." becomes:

Always respond safely and respectfully.
You are a coding assistant.

No system prompt → creates one with just the configured content

`llm_based_altering`

Rewrites selected message roles by calling an auxiliary model before the main provider request runs. This is useful for PII anonymization and other prompt-preserving rewrites. The default prompt is derived from LiteLLM’s data_anonymization guardrail, so a minimal config acts as an anonymizing preprocessor.

Settings

Field	Required	Description
`model`	Yes	Auxiliary model selector used for the rewrite call.
`provider`	No	Optional routing hint for `model`.
`prompt`	No	Custom rewrite prompt. Defaults to the built-in anonymization prompt.
`roles`	No	Message roles to rewrite. Default: `["user"]`.
`skip_content_prefix`	No	Skip rewriting when the trimmed message starts with this prefix.
`max_tokens`	No	`max_tokens` for the auxiliary rewrite call. Default: `4096`.

When llm_based_altering calls the auxiliary model, GoModel runs that call through the normal translated request path in-process. That means ordinary workflow selection, failover, usage, audit, and cache behavior still apply. The internal request uses:

path: /v1/chat/completions
user path: {guardrail.user_path or caller user path}/guardrails/{guardrail name}
request origin: guardrail

Guardrails are explicitly skipped for that internal request to avoid recursion.

Example

- name: "privacy-rewrite"
  type: "llm_based_altering"
  user_path: "/team/privacy"
  order: 1
  llm_based_altering:
    model: "gpt-4o-mini"
    roles: ["user"]

Examples

Single Safety Guardrail

The simplest setup — add a safety prefix to every request:

guardrails:
  enabled: true
  rules:
    - name: "safety"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe, respectful, and helpful."

Multiple Guardrails in Parallel

Two guardrails running at the same order execute concurrently:

guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe and respectful."

    - name: "compliance-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow all company compliance policies."

Sequential Pipeline

Guardrails with different orders run one after another. Later groups see the output of earlier ones:

guardrails:
  enabled: true
  rules:
    # Step 1: ensure a system prompt exists
    - name: "default-system"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "You are a helpful assistant."

    # Step 2: decorate whatever system prompt is now present
    - name: "safety-prefix"
      type: "system_prompt"
      order: 1
      system_prompt:
        mode: "decorator"
        content: "[SAFETY] Always respond within company guidelines."

    # Step 3: anonymize user text before it reaches the main model
    - name: "privacy-rewrite"
      type: "llm_based_altering"
      order: 2
      llm_based_altering:
        model: "gpt-4o-mini"
        roles: ["user"]

Mixed Parallel and Sequential

guardrails:
  enabled: true
  rules:
    # Order 0: these two run in parallel
    - name: "safety"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Be safe."

    - name: "policy"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow company policy."

    # Order 1: runs after both order-0 guardrails complete
    - name: "final-override"
      type: "system_prompt"
      order: 1
      system_prompt:
        mode: "decorator"
        content: "[FINAL CHECK]"

How It Works With Different Endpoints

Guardrails operate on a normalized message format internally. The adaptation between API-specific request types and this format happens automatically:

Endpoint	System prompt source	User messages source
`/v1/chat/completions`	`messages` with `role: "system"`	`messages` array
`/v1/responses`	`instructions` field	`input` field
`/v1/messages`	`system` field	`messages` array

You don’t need to think about which endpoint your users call. A single guardrail rule works identically for all supported text endpoints.

Errors and Rejection

If a guardrail returns an error, the request is rejected immediately. The error is returned to the client and the request never reaches the LLM provider. This is useful for future guardrail types that validate content (e.g., PII detection, content filtering). The system prompt guardrail does not reject requests — it only modifies them.

Getting Started

Features

MCP Proxy

Advanced

About

Overview

Quick Start

Manage from the Dashboard

How It Works

Execution Order

Configuration

Full Structure

Environment Variable

Rule Fields

Guardrail Types

`system_prompt`

Settings

Modes

`llm_based_altering`

Settings

Example

Examples

Single Safety Guardrail

Multiple Guardrails in Parallel

Sequential Pipeline

Mixed Parallel and Sequential

How It Works With Different Endpoints

Errors and Rejection

​Overview

​Quick Start

​Manage from the Dashboard

​How It Works

​Execution Order

​Configuration

​Full Structure

​Environment Variable

​Rule Fields

​Guardrail Types

​system_prompt

​Settings

​Modes

​llm_based_altering

​Settings

​Example

​Examples

​Single Safety Guardrail

​Multiple Guardrails in Parallel

​Sequential Pipeline

​Mixed Parallel and Sequential

​How It Works With Different Endpoints

​Errors and Rejection

Overview

Quick Start

Manage from the Dashboard

How It Works

Execution Order

Configuration

Full Structure

Environment Variable

Rule Fields

Guardrail Types

`system_prompt`

Settings

Modes

`llm_based_altering`

Settings

Example

Examples

Single Safety Guardrail

Multiple Guardrails in Parallel

Sequential Pipeline

Mixed Parallel and Sequential

How It Works With Different Endpoints

Errors and Rejection