Anubis uses Google’s Common Expression Language (CEL) for flexible policy expressions. CEL allows you to write complex bot detection rules using a familiar, safe syntax.

Expression Environment

CEL expressions in Anubis have access to request properties and specialized functions:
func BotEnvironment(dnsObj *dns.Dns) (*cel.Env, error) {
	return New(
		// Variables exposed to CEL programs:
		cel.Variable("remoteAddress", cel.StringType),
		cel.Variable("contentLength", cel.IntType),
		cel.Variable("host", cel.StringType),
		cel.Variable("method", cel.StringType),
		cel.Variable("userAgent", cel.StringType),
		cel.Variable("path", cel.StringType),
		cel.Variable("query", cel.MapType(cel.StringType, cel.StringType)),
		cel.Variable("headers", cel.MapType(cel.StringType, cel.StringType)),
		cel.Variable("load_1m", cel.DoubleType),
		cel.Variable("load_5m", cel.DoubleType),
		cel.Variable("load_15m", cel.DoubleType),
		// ... custom functions ...
	)
}
Source: lib/policy/expressions/environment.go:19-33

Available Variables

Request Properties

  • remoteAddress (string): Client IP address from X-Real-Ip header
  • contentLength (int): Request body size in bytes
  • host (string): Host header value
  • method (string): HTTP method (GET, POST, etc.)
  • userAgent (string): User-Agent header
  • path (string): URL path component
  • query (map[string]string): Query parameters
  • headers (map[string]string): All HTTP headers
Source: lib/policy/celchecker.go:59-88

System Load

  • load_1m (double): System load average over 1 minute
  • load_5m (double): System load average over 5 minutes
  • load_15m (double): System load average over 15 minutes
Load values are refreshed every 15 seconds by a background goroutine.
Source: lib/policy/expressions/loadavg.go:53-69
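
To make the load variables concrete, here is a minimal sketch of parsing the three averages from text in the Linux /proc/loadavg format and comparing one against a rule threshold. The helper name parseLoadAvg and the choice of /proc/loadavg as the data source are assumptions for illustration; Anubis may obtain these values differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLoadAvg extracts the 1, 5, and 15 minute averages from text in the
// Linux /proc/loadavg format, e.g. "0.52 0.58 0.59 1/389 12345".
// Hypothetical helper for illustration; not Anubis's implementation.
func parseLoadAvg(s string) (l1, l5, l15 float64, err error) {
	fields := strings.Fields(s)
	if len(fields) < 3 {
		return 0, 0, 0, fmt.Errorf("expected at least 3 fields, got %d", len(fields))
	}
	vals := make([]float64, 3)
	for i := range vals {
		if vals[i], err = strconv.ParseFloat(fields[i], 64); err != nil {
			return 0, 0, 0, err
		}
	}
	return vals[0], vals[1], vals[2], nil
}

func main() {
	l1, l5, l15, err := parseLoadAvg("0.52 0.58 0.59 1/389 12345")
	if err != nil {
		panic(err)
	}
	// A rule such as `load_1m > 4.0` compares against the first value.
	fmt.Println("load_1m > 4.0:", l1 > 4.0)
	fmt.Println("load_5m:", l5, "load_15m:", l15)
}
```

Because the values are sampled periodically, a rule based on them reacts to sustained load rather than to momentary spikes between samples.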

Built-in Functions

String Manipulation

regexSafe(string) string

Escapes a string for safe insertion into regular expressions:
expression:
  - path.matches('^/' + regexSafe(segments(path)[0]) + '/.*')
Source: lib/policy/expressions/environment.go:152-171
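
The following sketch shows why escaping matters when a request-derived string is spliced into a pattern. It uses Go's regexp.QuoteMeta as a stand-in for regexSafe; that equivalence is an assumption, and prefixPattern is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixPattern builds a regex that matches paths under the given first
// segment. When escape is true the segment is quoted so that its characters
// are matched literally. regexp.QuoteMeta stands in for regexSafe here
// (an assumption, not the Anubis implementation).
func prefixPattern(segment string, escape bool) *regexp.Regexp {
	if escape {
		segment = regexp.QuoteMeta(segment)
	}
	return regexp.MustCompile("^/" + segment + "/.*")
}

func main() {
	raw := prefixPattern("v1.0", false)
	safe := prefixPattern("v1.0", true)

	// Without escaping, "." matches any character, so "/v1x0/" also matches.
	fmt.Println("unescaped matches /v1x0/users:", raw.MatchString("/v1x0/users"))
	fmt.Println("escaped matches /v1x0/users:  ", safe.MatchString("/v1x0/users"))
	fmt.Println("escaped matches /v1.0/users:  ", safe.MatchString("/v1.0/users"))
}
```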

segments(string) list[string]

Splits a path into segments:
expression:
  - segments(path)[0] == 'api'  # First path segment is 'api'
  - segments(path).size() > 3    # Path has more than 3 segments
Source: lib/policy/expressions/environment.go:173-194
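
As a rough model of the behavior implied by the examples above, the sketch below drops the leading slash and splits on "/". The helper name pathSegments and the handling of edge cases such as "/" are assumptions; check the source for the exact rules.

```go
package main

import (
	"fmt"
	"strings"
)

// pathSegments splits a URL path on "/" after dropping the leading slash.
// Assumed behavior, inferred from the documented examples.
func pathSegments(path string) []string {
	return strings.Split(strings.TrimPrefix(path, "/"), "/")
}

func main() {
	segs := pathSegments("/api/v1/users")
	fmt.Println("first segment is api:", segs[0] == "api")
	fmt.Println("more than 3 segments:", len(segs) > 3)
}
```

Indexing past the end of the list is an error in CEL, so it is safer to check size() before reading a segment that may not exist.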

DNS Functions

reverseDNS(string) list[string]

Performs reverse DNS lookup on an IP address:
expression:
  - reverseDNS(remoteAddress).exists(name, name.endsWith('.googlebot.com'))
Source: lib/policy/expressions/environment.go:61-78

lookupHost(string) list[string]

Resolves a hostname to IP addresses:
expression:
  - lookupHost('example.com').exists(ip, ip == remoteAddress)
Source: lib/policy/expressions/environment.go:80-97

verifyFCrDNS(string) bool

verifyFCrDNS(string, string) bool

Verifies Forward-Confirmed reverse DNS (FCrDNS). Optionally accepts a regex pattern:
expression:
  # Verify FCrDNS without constraining the hostname
  - verifyFCrDNS(remoteAddress)

  # Verify FCrDNS and require the hostname to match a pattern
  - verifyFCrDNS(remoteAddress, '.*\\.googlebot\\.com$')
Source: lib/policy/expressions/environment.go:99-127
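
Forward-confirmed reverse DNS combines the two lookups described above: resolve the IP to its PTR names, then confirm that at least one of those names resolves back to the same IP. The sketch below models that flow with an injectable resolver so it runs without network access. The resolver type, fakeResolver, and the canned records are all hypothetical; real code could back the two functions with net.LookupAddr and net.LookupHost.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// resolver abstracts the two DNS lookups so the check can run offline.
type resolver struct {
	reverse func(ip string) ([]string, error)   // IP -> PTR names
	forward func(host string) ([]string, error) // hostname -> IPs
}

// verifyFCrDNS reports whether some PTR name for ip resolves forward to the
// same ip. If pattern is non-empty, only names matching it are considered.
// Addresses are compared as strings, which is a simplification.
func verifyFCrDNS(r resolver, ip, pattern string) bool {
	var re *regexp.Regexp
	if pattern != "" {
		var err error
		if re, err = regexp.Compile(pattern); err != nil {
			return false
		}
	}
	names, err := r.reverse(ip)
	if err != nil {
		return false
	}
	for _, name := range names {
		name = strings.TrimSuffix(name, ".") // PTR answers are often fully qualified
		if re != nil && !re.MatchString(name) {
			continue
		}
		addrs, err := r.forward(name)
		if err != nil {
			continue
		}
		for _, a := range addrs {
			if a == ip {
				return true
			}
		}
	}
	return false
}

// fakeResolver returns canned answers for a documentation-range address.
func fakeResolver() resolver {
	ptr := map[string][]string{"192.0.2.10": {"crawl-1.googlebot.com."}}
	fwd := map[string][]string{"crawl-1.googlebot.com": {"192.0.2.10"}}
	return resolver{
		reverse: func(ip string) ([]string, error) { return ptr[ip], nil },
		forward: func(host string) ([]string, error) { return fwd[host], nil },
	}
}

func main() {
	r := fakeResolver()
	fmt.Println(verifyFCrDNS(r, "192.0.2.10", ""))
	fmt.Println(verifyFCrDNS(r, "192.0.2.10", `.*\.googlebot\.com$`))
	fmt.Println(verifyFCrDNS(r, "192.0.2.99", ""))
}
```

The forward confirmation is what makes the check meaningful: anyone who controls the reverse zone for an address can publish an arbitrary PTR name, but they cannot make that name resolve back to their address.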

arpaReverseIP(string) string

Transforms an IP address into ARPA reverse notation:
expression:
  # IPv4: 1.2.3.4 -> 4.3.2.1
  # IPv6: 2001:db8::1 -> 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2
  # The result carries no '.in-addr.arpa' or '.ip6.arpa' suffix
  - arpaReverseIP(remoteAddress) == '1.0.0.127'  # true for 127.0.0.1
Source: lib/policy/expressions/environment.go:132-149
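
The transformation in the comments above can be sketched with the standard net/netip package: IPv4 addresses are reversed octet by octet, IPv6 addresses nibble by nibble. The function name arpaReverse is hypothetical, and this is an approximation of the documented output rather than the Anubis code.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// arpaReverse returns the address with its IPv4 octets or IPv6 nibbles in
// reverse order, joined by dots, without any ".arpa" suffix.
func arpaReverse(s string) (string, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", err
	}
	var parts []string
	if addr.Is4() {
		b := addr.As4()
		for i := len(b) - 1; i >= 0; i-- {
			parts = append(parts, fmt.Sprintf("%d", b[i]))
		}
	} else {
		b := addr.As16()
		for i := len(b) - 1; i >= 0; i-- {
			// Emit the low nibble first, then the high nibble, because the
			// overall order is reversed.
			parts = append(parts, fmt.Sprintf("%x", b[i]&0x0f), fmt.Sprintf("%x", b[i]>>4))
		}
	}
	return strings.Join(parts, "."), nil
}

func main() {
	for _, ip := range []string{"1.2.3.4", "2001:db8::1"} {
		rev, err := arpaReverse(ip)
		if err != nil {
			panic(err)
		}
		fmt.Println(ip, "->", rev)
	}
}
```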

Header Functions

missingHeader(map, string) bool

Checks if a specific header is missing:
expression:
  - missingHeader(headers, 'User-Agent')
  - missingHeader(headers, 'Accept-Language')
Source: lib/policy/expressions/environment.go:35-59
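
In Go terms, the check amounts to asking whether a header has no values at all. The sketch below uses the standard http.Header type; the helper name headerMissing is hypothetical, and the exact matching rules Anubis applies to header names are not shown here.

```go
package main

import (
	"fmt"
	"net/http"
)

// headerMissing reports whether the header set carries no value for name.
// http.Header.Values canonicalizes the name before the lookup, so
// "user-agent" finds a header stored as "User-Agent".
func headerMissing(h http.Header, name string) bool {
	return len(h.Values(name)) == 0
}

func main() {
	h := http.Header{}
	h.Set("User-Agent", "curl/8.0")

	fmt.Println("User-Agent missing:     ", headerMissing(h, "user-agent"))
	fmt.Println("Accept-Language missing:", headerMissing(h, "Accept-Language"))
}
```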

Random Functions

randInt(int) int

Generates a random integer from 0 to n-1:
expression:
  - randInt(100) < 10  # 10% probability
Source: lib/policy/expressions/environment.go:215-228
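
The sampling idiom above can be modeled in Go with math/rand. In this sketch, sampled is a hypothetical helper that mirrors the expression randInt(100) < percent, and the loop estimates how often it fires.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampled returns true for roughly percent out of every 100 calls,
// mirroring the expression `randInt(100) < percent`.
// rand.Intn(100) yields a value in [0, 100).
func sampled(percent int) bool {
	return rand.Intn(100) < percent
}

func main() {
	const trials = 100000
	hits := 0
	for i := 0; i < trials; i++ {
		if sampled(10) {
			hits++
		}
	}
	fmt.Printf("%.1f%% of simulated requests selected\n", 100*float64(hits)/trials)
}
```

Each evaluation draws a fresh number, so the same client may be selected on one request and skipped on the next.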

String Extensions

Anubis includes the CEL strings extension:
expression:
  - userAgent.contains('bot')
  - path.startsWith('/api/')
  - host.endsWith('.example.com')
  - method.lowerAscii() == 'post'
  - userAgent.matches('(?i).*crawler.*')
Source: lib/policy/expressions/environment.go:206-209

Type Wrappers

Anubis provides CEL type wrappers for HTTP headers and query parameters:

HTTPHeaders

type HTTPHeaders struct {
	http.Header
}

func (h HTTPHeaders) Get(key ref.Val) ref.Val {
	result, ok := h.Find(key)
	if !ok {
		return types.ValOrErr(result, "no such key: %v", key)
	}
	return result
}
Source: lib/policy/expressions/http_headers.go:14-67

URLValues

type URLValues struct {
	url.Values
}

func (u URLValues) Find(key ref.Val) (ref.Val, bool) {
	k, ok := key.(types.String)
	if !ok {
		return nil, false
	}

	if _, ok := u.Values[string(k)]; !ok {
		return nil, false
	}

	return types.String(strings.Join(u.Values[string(k)], ",")), true
}
Source: lib/policy/expressions/url_values.go:16-56
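
One consequence of the Find method above is that repeated query parameters are joined with a comma and presented to CEL as a single string. The sketch below reproduces that lookup with the standard net/url package; queryValue is a hypothetical helper, not part of Anubis.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// queryValue mirrors the lookup shown above: repeated parameters are joined
// with "," and absent parameters report false.
func queryValue(rawQuery, key string) (string, bool) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", false
	}
	vs, ok := values[key]
	if !ok {
		return "", false
	}
	return strings.Join(vs, ","), true
}

func main() {
	for _, key := range []string{"tag", "page", "missing"} {
		v, ok := queryValue("tag=a&tag=b&page=2", key)
		fmt.Printf("%s: %q (present: %v)\n", key, v, ok)
	}
}
```

A rule that compares a parameter for equality should therefore account for the joined form when a client sends the same parameter more than once.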

Example Expressions

Block Specific User Agents

bots:
  - name: block-bad-bots
    action: deny
    expression:
      - userAgent.matches('(?i).*(bot|crawler|spider).*')
      - '!verifyFCrDNS(remoteAddress)'

Load-Based Challenges

bots:
  - name: high-load-protection
    action: challenge
    expression:
      - load_1m > 4.0
      - path.startsWith('/expensive-operation')

Search Engine Verification

bots:
  - name: verify-search-engines
    action: allow
    expression:
      # Bingbot hostnames end in .search.msn.com rather than .bingbot.com
      - verifyFCrDNS(remoteAddress, '.*\\.(googlebot\\.com|search\\.msn\\.com)$')

  - name: challenge-unverified
    action: challenge
    expression:
      - userAgent.contains('bot')

Missing Headers Detection

bots:
  - name: suspicious-clients
    action: challenge
    expression:
      - missingHeader(headers, 'Accept')
      - missingHeader(headers, 'Accept-Language')
      - missingHeader(headers, 'Accept-Encoding')

Path Segment Matching

bots:
  - name: protect-api
    action: challenge
    expression:
      - segments(path)[0] == 'api'
      - segments(path)[1] in ['v1', 'v2']

Query Parameter Validation

bots:
  - name: require-auth-token
    action: deny
    expression:
      - path.startsWith('/admin')
      - '!("token" in query)'

Compilation and Execution

Expressions are compiled at startup for performance:
func Compile(env *cel.Env, src string) (cel.Program, error) {
	intermediate, iss := env.Compile(src)
	if iss != nil {
		return nil, iss.Err()
	}

	ast, iss := env.Check(intermediate)
	if iss != nil {
		return nil, iss.Err()
	}

	return env.Program(
		ast,
		cel.EvalOptions(
			// optimize regular expressions right now instead of on the fly
			cel.OptOptimize,
		),
	)
}
Source: lib/policy/expressions/environment.go:237-255

Request Activation

CEL variables are resolved from HTTP requests:
func (cr *CELRequest) ResolveName(name string) (any, bool) {
	switch name {
	case "remoteAddress":
		return cr.Header.Get("X-Real-Ip"), true
	case "contentLength":
		return cr.ContentLength, true
	case "host":
		return cr.Host, true
	case "method":
		return cr.Method, true
	case "userAgent":
		return cr.UserAgent(), true
	case "path":
		return cr.URL.Path, true
	case "query":
		return expressions.URLValues{Values: cr.URL.Query()}, true
	case "headers":
		return expressions.HTTPHeaders{Header: cr.Header}, true
	case "load_1m":
		return expressions.Load1(), true
	case "load_5m":
		return expressions.Load5(), true
	case "load_15m":
		return expressions.Load15(), true
	default:
		return nil, false
	}
}
Source: lib/policy/celchecker.go:61-88

Best Practices

  1. Validate at startup: CEL expressions are compiled during config parsing to catch errors early
  2. Use standard library: Leverage CEL’s built-in string, list, and map functions
  3. Cache DNS results: DNS functions use a TTL-based cache to avoid repeated lookups
  4. Combine conditions: Use logical operators (&&, ||, !) to build complex rules
  5. Test expressions: Invalid CEL syntax causes Anubis to refuse to start
  6. Mind performance: DNS lookups and regex matching add latency; use judiciously
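
To illustrate the caching point, here is a minimal TTL cache sketch in Go. Everything in it (ttlCache, fakeClock, the 60-second TTL, the canned lookup result) is hypothetical and only shows the general technique; it is not the cache Anubis uses.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	value   []string
	expires time.Time
}

// ttlCache stores lookup results until their TTL elapses.
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock, so expiry can be tested
	entries map[string]entry
}

func newTTLCache(ttlSeconds int, now func() time.Time) *ttlCache {
	return &ttlCache{
		ttl:     time.Duration(ttlSeconds) * time.Second,
		now:     now,
		entries: make(map[string]entry),
	}
}

// get returns the cached value for key and calls lookup only on a miss or
// after expiry. The lock is held during lookup, which keeps the sketch
// simple but serializes concurrent callers.
func (c *ttlCache) get(key string, lookup func(string) ([]string, error)) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[key]; ok && c.now().Before(e.expires) {
		return e.value, nil
	}
	v, err := lookup(key)
	if err != nil {
		return nil, err
	}
	c.entries[key] = entry{value: v, expires: c.now().Add(c.ttl)}
	return v, nil
}

// fakeClock is a manually advanced clock for demonstrations and tests.
type fakeClock struct{ t time.Time }

func (f *fakeClock) now() time.Time { return f.t }

func (f *fakeClock) advanceSeconds(n int) {
	f.t = f.t.Add(time.Duration(n) * time.Second)
}

func main() {
	clock := &fakeClock{}
	cache := newTTLCache(60, clock.now)
	lookups := 0
	lookup := func(ip string) ([]string, error) {
		lookups++
		return []string{"host.example."}, nil
	}

	cache.get("192.0.2.1", lookup)
	cache.get("192.0.2.1", lookup) // served from the cache
	clock.advanceSeconds(61)
	cache.get("192.0.2.1", lookup) // entry expired, looked up again

	fmt.Println("lookups performed:", lookups)
}
```

Even with caching, the first request from a new address pays the full lookup cost, which is why DNS-based rules are best placed behind cheaper conditions.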

Threshold Expressions

Thresholds use a simplified CEL environment with only the weight variable:
func ThresholdEnvironment() (*cel.Env, error) {
	return New(
		cel.Variable("weight", cel.IntType),
	)
}
Source: lib/policy/expressions/environment.go:198-202

Example threshold configuration:
thresholds:
  - name: high-weight-challenge
    expression: weight > 100
    challenge:
      algorithm: fast
      difficulty: 10