Policies - Anubis

Policies in Anubis define how to identify and handle different types of traffic. They consist of bot rules (pattern-based detection) and thresholds (weight-based triggers).

Policy Structure

Anubis policies are configured in YAML and loaded at startup:

bots:
  - name: verified-googlebot
    remote_addresses:
      - "66.249.64.0/19"
    action: ALLOW

  - name: suspicious-user-agents
    user_agent_regex: "(curl|wget|scrapy)"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 3

thresholds:
  - name: high-suspicion
    expression: "weight >= 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 4

Bot Rules

Bot rules are evaluated sequentially. The first matching rule with a terminal action (ALLOW, DENY, CHALLENGE, DEBUG_BENCHMARK) determines the request’s fate.

Rule Definition

// From lib/config/config.go:58
type BotConfig struct {
    UserAgentRegex *string           `json:"user_agent_regex,omitempty"`
    PathRegex      *string           `json:"path_regex,omitempty"`
    HeadersRegex   map[string]string `json:"headers_regex,omitempty"`
    Expression     *ExpressionOrList `json:"expression,omitempty"`
    Challenge      *ChallengeRules   `json:"challenge,omitempty"`
    Weight         *Weight           `json:"weight,omitempty"`
    GeoIP          *GeoIP            `json:"geoip,omitempty"`
    ASNs           *ASNs             `json:"asns,omitempty"`
    Name           string            `json:"name"`
    Action         Rule              `json:"action"`
    RemoteAddr     []string          `json:"remote_addresses,omitempty"`
}

Matching Conditions

Rules can match requests using multiple conditions (AND logic):

User Agent (Regex)

Match against the User-Agent header:

- name: block-python-scrapers
  user_agent_regex: "python-requests|httpx|aiohttp"
  action: DENY

Implementation: lib/policy/checker.go:NewUserAgentChecker()

Path (Regex)

Match against the request path:

- name: protect-admin
  path_regex: "^/admin/.*"
  action: CHALLENGE
  challenge:
    difficulty: 5

Implementation: lib/policy/checker.go:NewPathChecker()

Headers (Regex Map)

Match multiple headers with different patterns:

- name: api-key-check
  headers_regex:
    X-API-Key: "^key-[a-f0-9]{32}$"
    Accept: "application/json"
  action: ALLOW

Use .* to check if a header exists:

headers_regex:
  X-Custom-Header: ".*"  # Just check presence

Implementation: lib/policy/checker.go:NewHeadersChecker()

Remote Addresses (CIDR)

Match against client IP addresses:

- name: internal-network
  remote_addresses:
    - "10.0.0.0/8"
    - "172.16.0.0/12"
    - "192.168.0.0/16"
  action: ALLOW

Implementation: lib/policy/checker.go:NewRemoteAddrChecker() using gaissmai/bart prefix table
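
As a rough illustration of CIDR matching, the sketch below uses only the standard library's net/netip package with a linear scan. It is not the gaissmai/bart prefix table used by Anubis, and `parsePrefixes` and `containsAddr` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/netip"
)

// parsePrefixes parses CIDR strings, failing on the first invalid entry.
func parsePrefixes(cidrs []string) ([]netip.Prefix, error) {
	out := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		out = append(out, p.Masked())
	}
	return out, nil
}

// containsAddr reports whether addr falls inside any of the prefixes.
// Unparseable addresses never match.
func containsAddr(prefixes []netip.Prefix, addr string) bool {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false
	}
	ip = ip.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	for _, p := range prefixes {
		if p.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	internal, err := parsePrefixes([]string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"})
	if err != nil {
		panic(err)
	}
	fmt.Println(containsAddr(internal, "172.20.1.5"))  // inside 172.16.0.0/12
	fmt.Println(containsAddr(internal, "203.0.113.9")) // outside every range
}
```

A linear scan is fine for a handful of ranges; a prefix table becomes worthwhile once a rule carries thousands of prefixes.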

CEL Expressions

Advanced matching with Common Expression Language:

- name: rate-limit-trigger
  expression:
    - "req.headers['x-forwarded-for'].size() > 0"
    - "req.path.startsWith('/api/')"
    - "req.method in ['POST', 'PUT', 'DELETE']"
  action: WEIGH
  weight:
    adjust: 5

Available variables:

req.method (string)
req.path (string)
req.headers (map)
req.query (map)
DNS lookups (via expressions)

Implementation: lib/policy/celchecker.go:NewCELChecker()

GeoIP (Thoth Integration)

Match by country code (requires Thoth):

- name: block-regions
  geoip:
    countries:
      - CN
      - RU
  action: DENY

Requires: Thoth service configured via ANUBIS_THOTH_URL

Implementation: lib/thoth/geoipchecker.go

ASN (Thoth Integration)

Match by Autonomous System Number:

- name: cloud-providers
  asns:
    match:
      - 16509  # Amazon AWS
      - 15169  # Google Cloud
      - 8075   # Microsoft Azure
  action: CHALLENGE
  challenge:
    difficulty: 2

Requires: Thoth service

Implementation: lib/thoth/asnchecker.go

All conditions within a single bot rule are AND-ed together. If you specify both user_agent_regex and path_regex, the request must match both.
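
The AND semantics can be pictured as a list of checks that must all pass. The sketch below is an assumption-laden illustration: `check`, `allOf`, `userAgentMatches`, `pathMatches`, and `newRequest` are hypothetical names, not the types used in lib/policy.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// check is a hypothetical predicate over a request.
type check func(r *http.Request) bool

// allOf combines checks with AND semantics: the first failing check rejects.
func allOf(checks ...check) check {
	return func(r *http.Request) bool {
		for _, c := range checks {
			if !c(r) {
				return false
			}
		}
		return true
	}
}

// userAgentMatches builds a check against the User-Agent header.
func userAgentMatches(pattern string) check {
	re := regexp.MustCompile(pattern)
	return func(r *http.Request) bool { return re.MatchString(r.UserAgent()) }
}

// pathMatches builds a check against the request path.
func pathMatches(pattern string) check {
	re := regexp.MustCompile(pattern)
	return func(r *http.Request) bool { return re.MatchString(r.URL.Path) }
}

// newRequest builds a GET request with the given path and User-Agent.
func newRequest(path, ua string) *http.Request {
	r, err := http.NewRequest(http.MethodGet, "http://example.com"+path, nil)
	if err != nil {
		panic(err)
	}
	r.Header.Set("User-Agent", ua)
	return r
}

func main() {
	rule := allOf(userAgentMatches("curl|wget"), pathMatches("^/admin/.*"))
	fmt.Println(rule(newRequest("/admin/users", "curl/8.5.0")))  // both match
	fmt.Println(rule(newRequest("/index.html", "curl/8.5.0")))   // path fails
	fmt.Println(rule(newRequest("/admin/users", "Mozilla/5.0"))) // user agent fails
}
```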

Rule Validation

Rules are validated on startup:

// From lib/config/config.go:95
func (b *BotConfig) Valid() error {
    var errs []error
    
    if b.Name == "" {
        errs = append(errs, ErrBotMustHaveName)
    }
    
    // Must have at least one matching condition
    allFieldsEmpty := b.UserAgentRegex == nil &&
        b.PathRegex == nil &&
        len(b.RemoteAddr) == 0 &&
        len(b.HeadersRegex) == 0 &&
        b.ASNs == nil &&
        b.GeoIP == nil
    
    if allFieldsEmpty && b.Expression == nil {
        errs = append(errs, ErrBotMustHaveUserAgentOrPath)
    }
    
    // Validate regexes compile
    if b.UserAgentRegex != nil {
        if _, err := regexp.Compile(*b.UserAgentRegex); err != nil {
            errs = append(errs, ErrInvalidUserAgentRegex, err)
        }
    }
    
    return errors.Join(errs...)
}

Actions

Bot rules can specify five different actions:

ALLOW

Immediately proxy the request to upstream without challenge.

- name: verified-bot
  remote_addresses:
    - "66.249.64.0/19"
  action: ALLOW

DENY

Block the request with a 403 Forbidden response.

- name: blacklisted-ips
  remote_addresses:
    - "203.0.113.0/24"
  action: DENY

CHALLENGE

Issue a proof-of-work challenge. Requires challenge configuration.

- name: suspicious-bot
  user_agent_regex: "bot|crawler|spider"
  action: CHALLENGE
  challenge:
    algorithm: fast
    difficulty: 3

WEIGH

Adjust the request’s suspicion weight and continue evaluation.

- name: missing-common-headers
  expression:
    - "!('accept-language' in req.headers)"
  action: WEIGH
  weight:
    adjust: 5

Default weight adjustment is 5 if not specified.

DEBUG_BENCHMARK

Render a benchmark page for testing challenge performance.

- name: benchmark-endpoint
  path_regex: "^/__benchmark$"
  action: DEBUG_BENCHMARK

Action Flow

// From lib/anubis.go:609
for _, b := range s.policy.Bots {
    match, err := b.Rules.Check(r)
    if match {
        switch b.Action {
        case config.RuleDeny, config.RuleAllow, 
             config.RuleBenchmark, config.RuleChallenge:
            // Terminal action - return immediately
            return cr("bot/"+b.Name, b.Action, weight), &b, nil
        case config.RuleWeigh:
            // Non-terminal - accumulate weight and continue
            weight += b.Weight.Adjust
        }
    }
}

Order matters! Place ALLOW rules for verified bots first, then WEIGH rules to accumulate suspicion, and finally DENY/CHALLENGE rules.
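
To see why ordering matters, here is a self-contained sketch of the same loop. The `bot`, `decision`, and `evaluate` names are hypothetical, and matching is reduced to a substring test on the User-Agent purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

type action string

const (
	actionAllow action = "ALLOW"
	actionDeny  action = "DENY"
	actionWeigh action = "WEIGH"
)

// bot is a hypothetical, simplified rule: it matches when the User-Agent
// contains needle.
type bot struct {
	name   string
	needle string
	action action
	adjust int // only used by WEIGH rules
}

// decision mirrors the shape of a check result. decided is false when no
// terminal rule matched, in which case thresholds would be consulted next.
type decision struct {
	rule    string
	action  action
	weight  int
	decided bool
}

// evaluate walks the rules in order. WEIGH rules add to the running weight;
// the first matching rule with any other action ends the walk.
func evaluate(bots []bot, userAgent string) decision {
	weight := 0
	for _, b := range bots {
		if !strings.Contains(userAgent, b.needle) {
			continue
		}
		if b.action == actionWeigh {
			weight += b.adjust
			continue
		}
		return decision{"bot/" + b.name, b.action, weight, true}
	}
	return decision{weight: weight}
}

func main() {
	bots := []bot{
		{name: "verified-bot", needle: "Googlebot", action: actionAllow},
		{name: "python-client", needle: "python", action: actionWeigh, adjust: 5},
		{name: "scrapy", needle: "scrapy", action: actionDeny},
	}
	fmt.Printf("%+v\n", evaluate(bots, "Googlebot/2.1 python")) // ALLOW wins before any weight is added
	fmt.Printf("%+v\n", evaluate(bots, "scrapy python"))        // weight 5, then DENY
	fmt.Printf("%+v\n", evaluate(bots, "python-requests/2.31")) // weight 5, no terminal rule
}
```

In the first call the ALLOW rule returns before the WEIGH rule is reached, which is the reason to list verified bots first.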

Thresholds

Thresholds evaluate accumulated weight from WEIGH actions using CEL expressions:

thresholds:
  - name: low-suspicion
    expression: "weight >= 5 && weight < 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 2

  - name: high-suspicion
    expression: "weight >= 10 && weight < 20"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 5

  - name: extreme-suspicion
    expression: "weight >= 20"
    action: DENY

Threshold Evaluation

// From lib/anubis.go:627
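for _, t := range s.policy.Thresholds {
    result, _, err := t.Program.ContextEval(
        r.Context(),
        &policy.ThresholdRequest{Weight: weight},
    )
    if err != nil {
        // Skip thresholds that fail to evaluate (handling abbreviated here)
        continue
    }

    // A threshold matches only when its expression evaluates to boolean true
    matches, _ := result.Value().(bool)

    if matches {
        return cr("threshold/"+t.Name, t.Action, weight), &policy.Bot{
            Challenge: t.Challenge,
            Rules:     &checker.List{},
        }, nil
    }
}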

Thresholds are evaluated in order. The first matching threshold determines the action.
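
One consequence of first-match evaluation is that an open-ended range listed early can shadow a stricter threshold after it. The sketch below illustrates this with plain Go predicates standing in for the compiled CEL programs; `threshold` and `firstMatch` are hypothetical names.

```go
package main

import "fmt"

// threshold is a hypothetical stand-in: matches replaces the compiled CEL
// program and receives the accumulated weight.
type threshold struct {
	name    string
	action  string
	matches func(weight int) bool
}

// firstMatch returns the first threshold whose predicate accepts weight.
func firstMatch(ts []threshold, weight int) (threshold, bool) {
	for _, t := range ts {
		if t.matches(weight) {
			return t, true
		}
	}
	return threshold{}, false
}

func main() {
	// An open-ended range listed first shadows the stricter one after it.
	shadowed := []threshold{
		{"high", "CHALLENGE", func(w int) bool { return w >= 10 }},
		{"extreme", "DENY", func(w int) bool { return w >= 20 }},
	}
	// Listing the strictest threshold first makes it reachable.
	ordered := []threshold{shadowed[1], shadowed[0]}

	t, _ := firstMatch(shadowed, 25)
	fmt.Println(t.name, t.action) // "high" wins although the weight is 25
	t, _ = firstMatch(ordered, 25)
	fmt.Println(t.name, t.action) // "extreme" is now reached
	_, ok := firstMatch(ordered, 3)
	fmt.Println(ok) // no threshold applies to a weight of 3
}
```

Either list the strictest threshold first or give each range an upper bound so the ranges do not overlap.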

Threshold Definition

// From lib/config/threshold.go:32
type Threshold struct {
    Expression *ExpressionOrList `json:"expression"`
    Challenge  *ChallengeRules   `json:"challenge"`
    Name       string            `json:"name"`
    Action     Rule              `json:"action"`
}

Thresholds cannot use the WEIGH action. Configuration loading rejects such a threshold with a validation error:

if t.Action == RuleWeigh {
    errs = append(errs, ErrThresholdCannotHaveWeighAction)
}

Rule Hashing

Each bot rule is hashed to detect policy changes:

// From lib/policy/bot.go:19
func (b Bot) Hash() string {
    return internal.FastHash(fmt.Sprintf("%s::%s", b.Name, b.Rules.Hash()))
}

This hash is embedded in JWTs. When you update your policy:

Rule hash changes
Existing JWTs with old hash fail validation
Clients must re-solve challenges

This prevents bypassing updated security rules with old tokens.
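
The invalidation effect can be sketched as follows. This is illustrative only: SHA-256 stands in for internal.FastHash, whose algorithm is not shown here, and `token`, `ruleHash`, and `stillValid` are hypothetical names rather than the real JWT handling.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ruleHash is a stand-in for Bot.Hash(): SHA-256 replaces internal.FastHash,
// but the "name::rules" input layout follows the excerpt above.
func ruleHash(name, rulesHash string) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s::%s", name, rulesHash)))
	return hex.EncodeToString(sum[:])
}

// token is a hypothetical, unsigned stand-in for the JWT claims.
type token struct {
	policyRule string // hash of the rule that issued the challenge
}

// stillValid reports whether the hash stored in the token still matches the
// rule as currently configured.
func stillValid(t token, name, rulesHash string) bool {
	return t.policyRule == ruleHash(name, rulesHash)
}

func main() {
	issued := token{policyRule: ruleHash("suspicious-bot", "ua:bot|crawler|spider")}

	// Same rule definition: the token is still accepted.
	fmt.Println(stillValid(issued, "suspicious-bot", "ua:bot|crawler|spider"))
	// The rule was edited: the stored hash no longer matches.
	fmt.Println(stillValid(issued, "suspicious-bot", "ua:bot|crawler|spider|scraper"))
}
```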

Check Result

Policy evaluation returns a CheckResult:

// From lib/policy/checkresult.go:9
type CheckResult struct {
    Name   string       // e.g., "bot/suspicious-crawler" or "threshold/high-suspicion"
    Rule   config.Rule  // ALLOW, DENY, CHALLENGE, etc.
    Weight int          // Accumulated weight
}

This is logged and exposed in Prometheus metrics:

anubis_policy_results{rule="bot/verified-googlebot",action="ALLOW"} 1234
anubis_policy_results{rule="threshold/high-suspicion",action="CHALLENGE"} 567

Import Statements

Reuse bot rules across multiple configs:

# main-policy.yaml
bots:
  - import: "(data)/verified-bots.yaml"  # Built-in
  - import: "/etc/anubis/custom-rules.yaml"  # External
  
  - name: site-specific-rule
    path_regex: "^/protected/"
    action: CHALLENGE

# verified-bots.yaml
- name: googlebot
  remote_addresses:
    - "66.249.64.0/19"
  action: ALLOW

- name: bingbot
  remote_addresses:
    - "40.77.167.0/24"
  action: ALLOW

Use the (data)/ prefix to import built-in bot policies shipped with Anubis. These are embedded at compile time.

CEL Expressions

Anubis supports Common Expression Language for advanced matching:

Available Functions

req.path.startsWith('/api/')
req.headers['user-agent'].contains('Mobile')
req.method.matches('^(POST|PUT|DELETE)$')

Environment Variables

Expressions have access to:

// From lib/policy/expressions/
- req.method (string)
- req.path (string)
- req.headers (map<string, string>)
- req.query (map<string, string>)
- loadavg() (float, Linux only)
- dns.forward(hostname) ([]string)
- dns.reverse(ip) ([]string)

Example Policies

# Escalate difficulty based on behavior
bots:
  # Known good bots
  - import: "(data)/verified-bots.yaml"
  
  # Add suspicion for missing headers
  - name: missing-language
    expression:
      - "!('accept-language' in req.headers)"
    action: WEIGH
    weight:
      adjust: 3
  
  - name: missing-encoding
    expression:
      - "!('accept-encoding' in req.headers)"
    action: WEIGH
    weight:
      adjust: 3
  
  # Add suspicion for scraper user agents
  - name: scraper-ua
    user_agent_regex: "(curl|wget|python|scrapy)"
    action: WEIGH
    weight:
      adjust: 10

thresholds:
  # Light challenge for moderate suspicion
  - name: moderate
    expression: "weight >= 5 && weight < 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 2
  
  # Heavy challenge for high suspicion
  - name: high
    expression: "weight >= 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 4
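
To make the arithmetic of this policy concrete, the sketch below adds up the three WEIGH adjustments for a few request shapes and applies the two thresholds in order. The `signals`, `weight`, and `outcome` names are hypothetical; the numbers come from the YAML above, and `default/allow` is the fallback result name used when nothing matches.

```go
package main

import "fmt"

// signals describes which of the example's WEIGH rules a request triggers.
type signals struct {
	missingLanguage bool // no Accept-Language header: +3
	missingEncoding bool // no Accept-Encoding header: +3
	scraperUA       bool // User-Agent matches (curl|wget|python|scrapy): +10
}

// weight sums the adjustments of every triggered WEIGH rule.
func weight(s signals) int {
	w := 0
	if s.missingLanguage {
		w += 3
	}
	if s.missingEncoding {
		w += 3
	}
	if s.scraperUA {
		w += 10
	}
	return w
}

// outcome applies the two thresholds in order and falls back to the
// default allow when neither matches.
func outcome(w int) string {
	switch {
	case w >= 5 && w < 10:
		return "threshold/moderate"
	case w >= 10:
		return "threshold/high"
	default:
		return "default/allow"
	}
}

func main() {
	cases := []signals{
		{missingLanguage: true},                        // 3: below every threshold
		{missingLanguage: true, missingEncoding: true}, // 6: moderate
		{scraperUA: true},                              // 10: high
		{true, true, true},                             // 16: high
	}
	for _, c := range cases {
		w := weight(c)
		fmt.Println(w, outcome(w))
	}
}
```

Note that a single missing header (weight 3) never reaches a threshold on its own, so such requests pass unchallenged under this policy.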

Default Behavior

If no bot rules or thresholds match, Anubis allows the request:

// From lib/anubis.go:648
return cr("default/allow", config.RuleAllow, weight), &policy.Bot{
    Challenge: &config.ChallengeRules{
        Difficulty: s.policy.DefaultDifficulty,
        Algorithm:  config.DefaultAlgorithm,
    },
    Rules: &checker.List{},
}, nil

This “default allow” behavior means Anubis is not a firewall by itself. It only challenges/blocks traffic that matches your rules. Combine it with proper network security.

Metrics and Monitoring

Policy decisions are tracked:

# Rule application counts
anubis_policy_results{rule="bot/verified-googlebot",action="ALLOW"}
anubis_policy_results{rule="bot/suspicious-crawler",action="CHALLENGE"}
anubis_policy_results{rule="threshold/high-suspicion",action="CHALLENGE"}
anubis_policy_results{rule="bot/blocklist",action="DENY"}

Request headers include policy metadata:

X-Anubis-Rule: bot/suspicious-crawler
X-Anubis-Action: CHALLENGE
X-Anubis-Status: PASS

Best Practices

Order Rules Carefully

Place ALLOW rules first, then WEIGH, then terminal actions.

Use Imports

Reuse verified bot lists with import: "(data)/verified-bots.yaml".

Start Conservative

Begin with WEIGH actions and observe metrics before adding DENY rules.

Test Expressions

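Attach the DEBUG_BENCHMARK action to a test endpoint to check that a rule matches: the benchmark page only renders when the rule's conditions, including any CEL expressions, match the request.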

Next Steps

Challenges

Configure challenge difficulty and algorithms

Architecture

Understand how policies integrate with the proxy
