Production YAML Guide
Learn the practical structure of a BlazeRules YAML file, then copy a full multi-instance production example.
This guide is the practical version of the rule reference. It shows how to write a YAML file that a production process can load directly.
The file layout
A complete YAML file can contain both rule semantics and local agent wiring:
| Section | Used by | Purpose |
|---|---|---|
| schema_version | engine | Rule file compatibility version. Use "2.1". |
| fields | engine | Optional type hints. Omit most fields and let BlazeRules infer them from the first batch. |
| lookups | engine | Named CSV lookup sets for in_lookup / not_in_lookup. |
| decisions | engine | Default decision and precedence. |
| ruleset | engine | Rule metadata and rules. |
| instances | blazerules_agent only | Local input/output instances for HTTP, file tail, or stdin. |
instances is ignored by the in-process Python and C++ rule engine. The agent reads it to start multiple inputs that each load rules and write decisions.
Field hints
Field hints are optional. Use them when a field is important enough that you want stable typing before the first batch arrives.
fields:
  event_id: {type: string, nullable: false}
  card_token: {type: entity_key, nullable: false}
  amount: {type: float32, nullable: false}
  country_code:
    type: categorical
    values: [US, GB, IN, DE]
  event_ts_ms: {type: timestamp_ms, nullable: false}
Allowed field types:
float32, float64, int32, int64, categorical, entity_key, timestamp_ms, boolean, string.
Use values: only when you intentionally want a closed categorical set. Most fields should be inferred.
Lookups
Lookups are named files resolved relative to the YAML file. For S3-hosted YAML, use exact s3://bucket/key paths or relative keys under the same prefix.
lookups:
  blocked_merchants:
    type: string_set
    path: lookups/blocked_merchants.csv
  risky_bins:
    type: int_set
    path: lookups/risky_bins.csv
  vpn_ranges:
    type: ipv4_cidr_set
    path: lookups/vpn_ranges.csv
Supported lookup types:
| Type | CSV column | Used with |
|---|---|---|
| string_set | value | STRING, CATEGORICAL, ENTITY_KEY |
| int_set | value | integer numeric fields |
| ipv4_cidr_set | cidr | IPv4 string fields |
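To make the ipv4_cidr_set row concrete, here is a minimal Python sketch of a CIDR membership check over a CSV with a cidr column. It uses only the standard library and is not BlazeRules' implementation. The helper names load_cidr_set and ip_in_cidr_set and the sample CSV contents are hypothetical.

```python
import csv
import io
import ipaddress

def load_cidr_set(csv_text):
    """Parse CSV text with a `cidr` header column into IPv4 networks."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [ipaddress.ip_network(row["cidr"].strip()) for row in reader]

def ip_in_cidr_set(ip, networks):
    """Return True when the address falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Hypothetical contents standing in for lookups/vpn_ranges.csv.
SAMPLE_CSV = "cidr\n10.0.0.0/8\n203.0.113.0/24\n"
networks = load_cidr_set(SAMPLE_CSV)

inside = ip_in_cidr_set("10.1.2.3", networks)
outside = ip_in_cidr_set("198.51.100.7", networks)
print(inside, outside)  # True False
```

A linear scan is enough for illustration. A production engine would more likely use a prefix tree or sorted ranges.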
Decisions
Decision policy turns rule matches into one final per-row decision.
decisions:
  default: APPROVE
  precedence: [BLOCK, REVIEW, FLAG, APPROVE]
  risk_bands:
    LOW: [0, 29]
    MEDIUM: [30, 69]
    HIGH: [70, 1000]
Rules can set action, weight, priority, severity, and reason_code. Production routing should normally consume result.grouped_decision_indices() or compact decision logs instead of scanning Python strings one row at a time.
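One way to read the decision policy above is sketched below in plain Python. It assumes that the matched action appearing earliest in the precedence list wins, that default applies when nothing matches, and that risk bands are inclusive on both ends. This guide does not say how the engine computes a risk score, so the sketch takes the score as an input. The function names are hypothetical.

```python
PRECEDENCE = ["BLOCK", "REVIEW", "FLAG", "APPROVE"]
DEFAULT = "APPROVE"
RISK_BANDS = {"LOW": (0, 29), "MEDIUM": (30, 69), "HIGH": (70, 1000)}

def final_decision(matched_actions):
    """Pick the matched action that appears earliest in PRECEDENCE (assumed semantics)."""
    if not matched_actions:
        return DEFAULT
    return min(matched_actions, key=PRECEDENCE.index)

def risk_band(score):
    """Map a score to the band whose inclusive range contains it, or None."""
    for name, (low, high) in RISK_BANDS.items():
        if low <= score <= high:
            return name
    return None

print(final_decision(["FLAG", "REVIEW"]))  # REVIEW
print(final_decision([]))                  # APPROVE
print(risk_band(45))                       # MEDIUM
```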
Condition examples by family
Numeric and range
- {field: amount, op: gt, value: 1000}
- {field: amount, op: between_including, value: [100, 5000]}
- {field: amount, op: gt_field, other_field: historical_avg_amount}
Categorical and null handling
- {field: country_code, op: in, values: [US, GB, IN]}
- {field: optional_note, op: is_empty}
- {field: description, op: is_not_empty}
Strings and regex
- {field: user_agent, op: contains, value: Mobile}
- {field: transaction_description, op: regex, value: "payment|checkout"}
- {field: user_agent, op: not_regex, value: "bot|crawler"}
Arrays, flags, and nested arrays of objects
- {field: tags, op: contains_any, values: [vip, trusted]}
- {field: signal_flags, op: flags_any, mask: 4}
- array_any:
    path: items
    where:
      and:
        - {field: price, op: gt, value: 100}
        - {field: category, op: eq, value: electronics}
Inside array_any, field names are scoped to the same array element. The example only matches when one item has both price > 100 and category = electronics.
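The same-element scoping is easy to get wrong, so here is a small Python sketch that contrasts it with a looser cross-element check. It illustrates the semantics described above and is not the engine's code. The helper names and sample orders are hypothetical.

```python
def array_any(items, predicate):
    """True when at least one single element satisfies the whole predicate."""
    return any(predicate(item) for item in items)

def is_expensive_electronics(item):
    # Both conditions are evaluated against the same array element.
    return item.get("price", 0) > 100 and item.get("category") == "electronics"

# The price and category conditions are met, but by different items.
split_order = [
    {"price": 250, "category": "clothing"},
    {"price": 20, "category": "electronics"},
]
# One item meets both conditions.
single_item_order = [{"price": 250, "category": "electronics"}]

split_match = array_any(split_order, is_expensive_electronics)
single_match = array_any(single_item_order, is_expensive_electronics)

# A cross-element check (not what array_any does) would match the split order.
cross_element = (
    any(i["price"] > 100 for i in split_order)
    and any(i["category"] == "electronics" for i in split_order)
)
print(split_match, single_match, cross_element)  # False True True
```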
Network, temporal, and geo
- {field: ip_address, op: ip_in_subnet, value: "10.0.0.0/8"}
- {field: event_ts_ms, op: within_last, value: 86400}
- {field: event_ts_ms, op: day_of_week_in, values: [1, 2, 3, 4, 5]}
- op: distance_gt
  lat_field: billing.lat
  lon_field: billing.lon
  other_lat_field: shipping.lat
  other_lon_field: shipping.lon
  value: 50
Lookups, windows, SQL, ML, and vectors
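For the window family, a count over a trailing duration per entity can be pictured with the Python sketch below. It assumes timestamps in milliseconds that arrive in order for each entity, and a window that includes the current event. The guide does not confirm these details, and the CountWindow class is hypothetical.

```python
from collections import defaultdict, deque

class CountWindow:
    """Trailing per-entity event counter (illustrative, assumes ordered timestamps)."""

    def __init__(self, duration_ms):
        self.duration_ms = duration_ms
        self._events = defaultdict(deque)

    def add_and_count(self, entity, ts_ms):
        """Record an event and return how many events remain inside the window."""
        queue = self._events[entity]
        queue.append(ts_ms)
        cutoff = ts_ms - self.duration_ms
        # Drop events that are at or before the start of the trailing window.
        while queue and queue[0] <= cutoff:
            queue.popleft()
        return len(queue)

window = CountWindow(duration_ms=10 * 60 * 1000)  # 10m
timestamps = [0, 60_000, 120_000, 180_000, 700_000]
counts = [window.add_and_count("c1", ts) for ts in timestamps]
print(counts)                   # [1, 2, 3, 4, 3]
print([c > 3 for c in counts])  # the fourth event exceeds a threshold of 3
```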
- {field: merchant_id, op: in_lookup, lookup: blocked_merchants}
- window:
    entity_field: card_token
    function: count
    duration: 10m
    op: gt
    value: 3
- sql: "amount > 1000 AND country_code IN ('US', 'GB')"
- model_score:
    model: fraud_logreg
    features: [amount, account_age_days, merchant_risk_score]
    op: gt
    value: 0.8
- vector_distance:
    fields: [embedding_0, embedding_1, embedding_2, embedding_3]
    metric: cosine
    reference: [0.1, 0.2, 0.3, 0.4]
    op: gt
    value: 0.7
Complete minimal production YAML
This single file can be loaded by Python/C++ as rules, and by blazerules_agent as a multi-instance local runtime.
schema_version: "2.1"
fields:
  event_id: {type: string, nullable: false}
  card_token: {type: entity_key, nullable: false}
  amount: {type: float32, nullable: false}
  country_code:
    type: categorical
    values: [US, GB, IN, DE]
  device_type:
    type: categorical
    values: [ios, android, web, emulator]
  event_ts_ms: {type: timestamp_ms, nullable: false}
  ip_address: {type: string}
lookups:
  blocked_merchants:
    type: string_set
    path: lookups/blocked_merchants.csv
  risky_bins:
    type: int_set
    path: lookups/risky_bins.csv
  vpn_ranges:
    type: ipv4_cidr_set
    path: lookups/vpn_ranges.csv
decisions:
  default: APPROVE
  precedence: [BLOCK, REVIEW, FLAG, APPROVE]
  risk_bands:
    LOW: [0, 29]
    MEDIUM: [30, 69]
    HIGH: [70, 1000]
ruleset:
  name: Payments Production Rules
  version: "2026.06.23"
  rules:
    - id: high_amount_emulator
      action: BLOCK
      severity: HIGH
      priority: 100
      weight: 60
      reason_code: HIGH_AMOUNT_EMULATOR
      conditions:
        and:
          - {field: amount, op: gt, value: 1000}
          - {field: device_type, op: eq, value: emulator}
    - id: blocked_merchant_or_vpn
      action: REVIEW
      severity: MEDIUM
      priority: 80
      weight: 40
      reason_code: MERCHANT_OR_NETWORK_RISK
      conditions:
        or:
          - {field: merchant.id, op: in_lookup, lookup: blocked_merchants}
          - {field: ip_address, op: in_lookup, lookup: vpn_ranges}
    - id: expensive_electronics_item
      action: FLAG
      severity: MEDIUM
      weight: 25
      conditions:
        array_any:
          path: items
          where:
            and:
              - {field: price, op: gt, value: 100}
              - {field: category, op: eq, value: electronics}
    - id: card_velocity_10m
      action: REVIEW
      severity: HIGH
      weight: 50
      conditions:
        window:
          entity_field: card_token
          function: count
          duration: 10m
          op: gt
          value: 3
    - id: impossible_shipping_distance
      action: REVIEW
      severity: HIGH
      weight: 45
      conditions:
        op: distance_gt
        lat_field: billing.lat
        lon_field: billing.lon
        other_lat_field: shipping.lat
        other_lon_field: shipping.lon
        value: 500
instances:
  - name: payments-http
    rules: rules.yaml
    batch_size: 4096
    flush_ms: 50
    service: payments-api
    source: http-json
    input:
      type: http
      host: 127.0.0.1
      port: 9480
    output:
      type: ndjson
      path: decisions-payments.ndjson
    dedupe:
      enabled: true
      key_fields: [event_id]
      ttl_seconds: 86400
  - name: checkout-pod-tail
    rules: rules.yaml
    batch_size: 2048
    flush_ms: 250
    service: checkout
    source: pod-stdout
    input:
      type: file_tail
      path: /var/log/containers/checkout.log
    output:
      type: ndjson
      path: decisions-checkout.ndjson
  - name: replay-stdin
    rules: rules.yaml
    batch_size: 8192
    flush_ms: 1000
    service: replay
    source: stdin
    input:
      type: stdin
    output:
      type: stdout
Run the multi-instance file:
blazerules_agent --config rules.yaml
Python equivalent: load the rule semantics from the same file
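import blazerules

# Load the rule semantics; the in-process engine ignores the instances section.
engine = blazerules.RuleEngine()
engine.load_rules("rules.yaml")

# One NDJSON event: amount > 1000 on an emulator satisfies high_amount_emulator.
payload = b'{"event_id":"e1","card_token":"c1","amount":1200,"device_type":"emulator","country_code":"US","event_ts_ms":1782150000000}\n'
result = engine.evaluate_ndjson(payload)
print(result.decisions)
Common production mistakes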
- Do not put credentials in YAML. Use environment variables, profiles, or platform secrets.
- Do not use instances: as a replacement for Kafka partitioning. It is a local agent convenience.
- Do not enable OutputDetail.BITMASKS for routing unless you need per-rule masks.
- Do not rely on inferred types for critical fields that may drift between producers.
- Keep lookup CSV paths stable. Missing lookup files fail rule activation.
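Because a missing lookup file fails rule activation, a deploy-time preflight check can catch the problem earlier. The Python sketch below resolves local lookup paths relative to the YAML file, as described in the Lookups section, and reports any that do not exist. The missing_lookups helper is hypothetical and not part of BlazeRules, it does not handle s3:// paths, and the lookup mapping is passed in directly rather than parsed from YAML.

```python
import tempfile
from pathlib import Path

def missing_lookups(yaml_path, lookup_paths):
    """Return {lookup_name: resolved_path} for lookup files that do not exist."""
    base = Path(yaml_path).parent
    return {
        name: str(base / rel)
        for name, rel in lookup_paths.items()
        if not (base / rel).is_file()
    }

# Demo in a temporary directory: one lookup file present, one absent.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lookups").mkdir()
    (root / "lookups" / "blocked_merchants.csv").write_text("value\nm_123\n")
    missing = missing_lookups(
        root / "rules.yaml",
        {
            "blocked_merchants": "lookups/blocked_merchants.csv",
            "risky_bins": "lookups/risky_bins.csv",
        },
    )
    missing_names = sorted(missing)

print(missing_names)  # ['risky_bins']
```

Running a check like this in CI or at container start keeps a renamed CSV from surfacing as a failed rule activation in production.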