Anubis uses Google’s Common Expression Language (CEL) for flexible policy expressions. CEL allows you to write complex bot detection rules using a familiar, safe syntax.
Expression Environment
CEL expressions in Anubis have access to request properties and specialized functions:
func BotEnvironment(dnsObj *dns.Dns) (*cel.Env, error) {
return New(
// Variables exposed to CEL programs:
cel.Variable("remoteAddress", cel.StringType),
cel.Variable("contentLength", cel.IntType),
cel.Variable("host", cel.StringType),
cel.Variable("method", cel.StringType),
cel.Variable("userAgent", cel.StringType),
cel.Variable("path", cel.StringType),
cel.Variable("query", cel.MapType(cel.StringType, cel.StringType)),
cel.Variable("headers", cel.MapType(cel.StringType, cel.StringType)),
cel.Variable("load_1m", cel.DoubleType),
cel.Variable("load_5m", cel.DoubleType),
cel.Variable("load_15m", cel.DoubleType),
// ... custom functions ...
)
}
Source: lib/policy/expressions/environment.go:19-33
Available Variables
Request Properties
remoteAddress (string): Client IP address from X-Real-Ip header
contentLength (int): Request body size in bytes
host (string): Host header value
method (string): HTTP method (GET, POST, etc.)
userAgent (string): User-Agent header
path (string): URL path component
query (map[string]string): Query parameters
headers (map[string]string): All HTTP headers
Source: lib/policy/celchecker.go:59-88
System Load
load_1m (double): System load average over 1 minute
load_5m (double): System load average over 5 minutes
load_15m (double): System load average over 15 minutes
Load values are refreshed every 15 seconds by a background goroutine.
Source: lib/policy/expressions/loadavg.go:53-69
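The periodic refresh described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in loadavg.go: the `Sampler` type, its injected `read` function, and the `/proc/loadavg` line format are all hypothetical choices made for the example.

```go
package main

import (
	"context"
	"fmt"
	"strconv"
	"strings"
	"sync/atomic"
	"time"
)

// LoadAvg mirrors the three values exposed as load_1m, load_5m and load_15m.
type LoadAvg struct{ Load1, Load5, Load15 float64 }

// parseLoadAvg parses the first three fields of a Linux /proc/loadavg line,
// for example "0.52 0.58 0.59 1/389 12345".
func parseLoadAvg(line string) (LoadAvg, error) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return LoadAvg{}, fmt.Errorf("expected at least 3 fields, got %d", len(fields))
	}
	var vals [3]float64
	for i := range vals {
		v, err := strconv.ParseFloat(fields[i], 64)
		if err != nil {
			return LoadAvg{}, fmt.Errorf("field %d: %w", i, err)
		}
		vals[i] = v
	}
	return LoadAvg{Load1: vals[0], Load5: vals[1], Load15: vals[2]}, nil
}

// Sampler (hypothetical) keeps the latest sample so readers never block.
type Sampler struct {
	current atomic.Pointer[LoadAvg]
	read    func() (string, error) // assumed source, e.g. reading /proc/loadavg
}

// Refresh reads and stores one sample.
func (s *Sampler) Refresh() error {
	line, err := s.read()
	if err != nil {
		return err
	}
	la, err := parseLoadAvg(line)
	if err != nil {
		return err
	}
	s.current.Store(&la)
	return nil
}

// Run refreshes on every tick until ctx is cancelled.
func (s *Sampler) Run(ctx context.Context, every time.Duration) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			_ = s.Refresh() // on error, keep serving the previous sample
		}
	}
}

// Load1 returns the latest 1-minute load average, or 0 before the first sample.
func (s *Sampler) Load1() float64 {
	if la := s.current.Load(); la != nil {
		return la.Load1
	}
	return 0
}

func main() {
	s := &Sampler{read: func() (string, error) { return "5.10 2.30 1.20 3/512 4242", nil }}
	if err := s.Refresh(); err != nil {
		panic(err)
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go s.Run(ctx, 15*time.Second)
	fmt.Println(s.Load1() > 4.0) // the same comparison a rule like `load_1m > 4.0` makes
}
```

Because readers only load a pointer, a rule that references load_1m can at worst see a value one refresh interval old.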
Built-in Functions
String Manipulation
regexSafe(string) string
Escapes a string for safe insertion into regular expressions:
expression:
- path.matches('^/' + regexSafe(segments(path)[0]) + '/.*')
Source: lib/policy/expressions/environment.go:152-171
segments(string) list[string]
Splits a path into segments:
expression:
- segments(path)[0] == 'api' # First path segment is 'api'
- segments(path).size() > 3 # Path has more than 3 segments
Source: lib/policy/expressions/environment.go:173-194
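A rough Go equivalent helps when reasoning about indexes. The sketch below assumes segments trims leading and trailing slashes and then splits on "/"; the real function may treat edge cases such as "/" or relative paths differently.

```go
package main

import (
	"fmt"
	"strings"
)

// segments is a stand-in for the CEL function of the same name.
// Assumption: surrounding slashes are trimmed before splitting.
func segments(path string) []string {
	return strings.Split(strings.Trim(path, "/"), "/")
}

func main() {
	parts := segments("/api/v1/users")
	fmt.Println(parts, len(parts)) // [api v1 users] 3

	// Check the length before indexing past the first element:
	// a short path has no second segment to compare.
	short := segments("/api")
	if len(short) > 1 {
		fmt.Println("version:", short[1])
	} else {
		fmt.Println("no version segment")
	}
}
```

In a policy, the same guard can be expressed by testing `segments(path).size()` in the same rule as the index.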
DNS Functions
reverseDNS(string) list[string]
Performs reverse DNS lookup on an IP address:
expression:
- reverseDNS(remoteAddress).exists(name, name.endsWith('.googlebot.com'))
Source: lib/policy/expressions/environment.go:61-78
lookupHost(string) list[string]
Resolves a hostname to IP addresses:
expression:
- lookupHost('example.com').exists(ip, ip == remoteAddress)
Source: lib/policy/expressions/environment.go:80-97
verifyFCrDNS(string) bool
verifyFCrDNS(string, string) bool
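Verifies forward-confirmed reverse DNS (FCrDNS): the address must have a reverse DNS name that resolves back to the same address. The optional second argument is a regex that the reverse DNS name must match:
expression:
# Accept any reverse DNS name that forward-confirms
- verifyFCrDNS(remoteAddress)
# Require the reverse DNS name to match a pattern
- verifyFCrDNS(remoteAddress, '.*\\.googlebot\\.com$')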
Source: lib/policy/expressions/environment.go:99-127
arpaReverseIP(string) string
Transforms an IP address into ARPA reverse notation:
expression:
# IPv4: 1.2.3.4 -> 4.3.2.1
# IPv6: 2001:db8::1 -> 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2
- arpaReverseIP(remoteAddress).endsWith('.in-addr.arpa')
Source: lib/policy/expressions/environment.go:132-149
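The transformation in the comments above can be reproduced with `net/netip`. This is a sketch of the reversal only, written to match those two sample outputs; it appends no zone suffix, and how the real function handles invalid input or IPv4-mapped IPv6 addresses is not covered by the source, so those choices here are assumptions.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// arpaReverseIP is a stand-in for the CEL function of the same name.
// IPv4 addresses are reversed octet by octet, IPv6 addresses nibble by nibble.
func arpaReverseIP(s string) (string, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", err
	}
	addr = addr.Unmap() // assumption: treat ::ffff:a.b.c.d as plain IPv4
	if addr.Is4() {
		b := addr.As4()
		return fmt.Sprintf("%d.%d.%d.%d", b[3], b[2], b[1], b[0]), nil
	}
	const hexDigits = "0123456789abcdef"
	b := addr.As16()
	parts := make([]string, 0, 32)
	for i := len(b) - 1; i >= 0; i-- {
		// Low nibble first, then high nibble, walking from the last byte.
		parts = append(parts, string(hexDigits[b[i]&0x0f]), string(hexDigits[b[i]>>4]))
	}
	return strings.Join(parts, "."), nil
}

func main() {
	v4, _ := arpaReverseIP("1.2.3.4")
	fmt.Println(v4) // 4.3.2.1

	v6, _ := arpaReverseIP("2001:db8::1")
	fmt.Println(v6)
}
```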
Header Functions
missingHeader(map[string]string, string) bool
Checks if a specific header is missing:
expression:
- missingHeader(headers, 'User-Agent')
- missingHeader(headers, 'Accept-Language')
Source: lib/policy/expressions/environment.go:35-59
Random Functions
randInt(int) int
Generates a random integer from 0 to n-1:
expression:
- randInt(100) < 10 # 10% probability
Source: lib/policy/expressions/environment.go:215-228
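The sampling idiom above maps directly onto `math/rand`. The `sampled` helper below is a hypothetical name used only for this sketch; it shows why `randInt(100) < 10` selects roughly one request in ten.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampled reports true for roughly percent% of calls.
// rand.Intn(100) returns a value in [0, 100), so exactly `percent`
// of the 100 possible outcomes satisfy the comparison.
func sampled(percent int) bool {
	return rand.Intn(100) < percent
}

func main() {
	const trials = 10000
	hits := 0
	for i := 0; i < trials; i++ {
		if sampled(10) {
			hits++
		}
	}
	fmt.Printf("%.1f%% of simulated requests selected\n", float64(hits)*100/trials)
}
```

Because each evaluation draws a fresh number, the same client may be selected on one request and skipped on the next.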
String Extensions
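Anubis enables the CEL strings extension, which adds functions such as lowerAscii alongside the standard string functions (contains, startsWith, endsWith, matches):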
expression:
- userAgent.contains('bot')
- path.startsWith('/api/')
- host.endsWith('.example.com')
- method.lowerAscii() == 'post'
- userAgent.matches('(?i).*crawler.*')
Source: lib/policy/expressions/environment.go:206-209
Type Wrappers
Anubis provides CEL type wrappers for HTTP headers and query parameters:
HTTPHeaders
type HTTPHeaders struct {
http.Header
}
func (h HTTPHeaders) Get(key ref.Val) ref.Val {
result, ok := h.Find(key)
if !ok {
return types.ValOrErr(result, "no such key: %v", key)
}
return result
}
Source: lib/policy/expressions/http_headers.go:14-67
URLValues
type URLValues struct {
url.Values
}
func (u URLValues) Find(key ref.Val) (ref.Val, bool) {
k, ok := key.(types.String)
if !ok {
return nil, false
}
if _, ok := u.Values[string(k)]; !ok {
return nil, false
}
return types.String(strings.Join(u.Values[string(k)], ",")), true
}
Source: lib/policy/expressions/url_values.go:16-56
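One consequence of the Find method above is that a repeated query parameter reaches CEL as a single comma-joined string. The following sketch reproduces that joining with `net/url`; the `queryValue` helper is a hypothetical name introduced for the example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// queryValue mimics the lookup shown above: it returns the values for key
// joined with commas, and false when the key is absent.
func queryValue(rawQuery, key string) (string, bool) {
	vals, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", false
	}
	v, ok := vals[key]
	if !ok {
		return "", false
	}
	return strings.Join(v, ","), true
}

func main() {
	raw := "tag=a&tag=b&page=2"

	fmt.Println(queryValue(raw, "tag"))   // a,b true
	fmt.Println(queryValue(raw, "page"))  // 2 true
	fmt.Println(queryValue(raw, "token")) //  false
}
```

A rule comparing `query["tag"]` against a single value therefore will not match when the parameter appears more than once.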
Example Expressions
Block Specific User Agents
bots:
- name: block-bad-bots
action: deny
expression:
- userAgent.matches('(?i).*(bot|crawler|spider).*')
- '!verifyFCrDNS(remoteAddress)'
Challenge Under High Load
bots:
- name: high-load-protection
action: challenge
expression:
- load_1m > 4.0
- path.startsWith('/expensive-operation')
Verify Search Engine Crawlers
bots:
- name: verify-search-engines
action: allow
expression:
- verifyFCrDNS(remoteAddress, '.*\\.(googlebot|bingbot)\\.com$')
- name: challenge-unverified
action: challenge
expression:
- userAgent.contains('bot')
Detect Missing Headers
bots:
- name: suspicious-clients
action: challenge
expression:
- missingHeader(headers, 'Accept')
- missingHeader(headers, 'Accept-Language')
- missingHeader(headers, 'Accept-Encoding')
Path Segment Matching
bots:
- name: protect-api
action: challenge
expression:
- segments(path)[0] == 'api'
- segments(path)[1] in ['v1', 'v2']
Query Parameter Validation
bots:
- name: require-auth-token
action: deny
expression:
- path.startsWith('/admin')
- '!("token" in query)'
Compilation and Execution
Expressions are compiled at startup for performance:
func Compile(env *cel.Env, src string) (cel.Program, error) {
intermediate, iss := env.Compile(src)
if iss != nil {
return nil, iss.Err()
}
ast, iss := env.Check(intermediate)
if iss != nil {
return nil, iss.Err()
}
return env.Program(
ast,
cel.EvalOptions(
// optimize regular expressions right now instead of on the fly
cel.OptOptimize,
),
)
}
Source: lib/policy/expressions/environment.go:237-255
Request Activation
CEL variables are resolved from HTTP requests:
func (cr *CELRequest) ResolveName(name string) (any, bool) {
switch name {
case "remoteAddress":
return cr.Header.Get("X-Real-Ip"), true
case "contentLength":
return cr.ContentLength, true
case "host":
return cr.Host, true
case "method":
return cr.Method, true
case "userAgent":
return cr.UserAgent(), true
case "path":
return cr.URL.Path, true
case "query":
return expressions.URLValues{Values: cr.URL.Query()}, true
case "headers":
return expressions.HTTPHeaders{Header: cr.Header}, true
case "load_1m":
return expressions.Load1(), true
case "load_5m":
return expressions.Load5(), true
case "load_15m":
return expressions.Load15(), true
default:
return nil, false
}
}
Source: lib/policy/celchecker.go:61-88
Best Practices
- Validate at startup: CEL expressions are compiled during config parsing to catch errors early
- Use standard library: Leverage CEL’s built-in string, list, and map functions
- Cache DNS results: DNS functions use a TTL-based cache to avoid repeated lookups
- Combine conditions: Use logical operators (&&, ||, !) to build complex rules
- Test expressions: Invalid CEL syntax causes Anubis to refuse to start
- Mind performance: DNS lookups and regex matching add latency; use judiciously
Threshold Expressions
Thresholds use a simplified CEL environment with only the weight variable:
func ThresholdEnvironment() (*cel.Env, error) {
return New(
cel.Variable("weight", cel.IntType),
)
}
Example threshold configuration:
thresholds:
- name: high-weight-challenge
expression: weight > 100
challenge:
algorithm: fast
difficulty: 10
Source: lib/policy/expressions/environment.go:198-202