Ingestion Overview
All supported input paths and how they reach the same batch evaluation engine.
BlazeRules evaluates batches. Every input path either delivers a batch directly or collects records until a batch is ready.
flowchart LR
    json["JSON / NDJSON"] --> engine["RuleEngine"]
    arrow["Arrow / PyArrow"] --> engine
    kafka["Kafka"] --> io["blazerules_io"]
    ipc["Arrow IPC / Avro / Protobuf"] --> io
    s3["S3 / local files"] --> io
    http["HTTP /v1/logs"] --> agent["blazerules_agent"]
    stdin["stdin"] --> agent
    tail["file_tail"] --> agent
    k8s["Kubernetes pod logs"] --> agent
    io --> engine
    agent --> engine
    engine --> decisions["decisions / scores / rule counts"]
    engine --> dlq["dead-letter records"]
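To make "collects records until a batch is ready" concrete, here is a minimal sketch of a size-based collector in plain Python. The function name `collect_batches` and the `max_records` parameter are hypothetical and are not part of BlazeRules; the agent and IO components may use different triggers, such as a time limit.

```python
from typing import Iterable, Iterator, List


def collect_batches(records: Iterable[str], max_records: int) -> Iterator[List[str]]:
    """Group a stream of records into lists of at most max_records items.

    Hypothetical helper for illustration only; not a BlazeRules API.
    """
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    batch: List[str] = []
    for record in records:
        batch.append(record)
        if len(batch) >= max_records:
            # The batch is full: hand it off and start a new one.
            yield batch
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        yield batch


lines = ['{"id": 1}', '{"id": 2}', '{"id": 3}', '{"id": 4}', '{"id": 5}']
batches = list(collect_batches(lines, max_records=2))
```

Each yielded list is a list of JSON strings, the shape the input matrix lists for `engine.evaluate_messages(list[str])`.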
Input Matrix
| Input | API or component | Best for |
|---|---|---|
| NDJSON bytes | engine.evaluate_ndjson(bytes) | Raw event streams, JSON logs, HTTP batches. |
| JSON string list | engine.evaluate_messages(list[str]) | Small scripts and simple tests. |
| PyArrow batch | engine.evaluate_batch(batch) | Typed pipelines and fastest non-JSON path. |
| Kafka | blazerules_io.KafkaConsumer, run_stream(...) | Brokered streams with offset handling. |
| HTTP logs/events | blazerules_agent --input http | Apps POST NDJSON to a local listener. |
| stdin | blazerules_agent --input stdin | Terminal output, process logs, shell pipelines. |
| File tail | blazerules_agent --input file_tail | Pod stdout/stderr files and node-local logs. |
| Plain text logs | wrap lines as JSON first | Unstructured terminal/stdout/stderr text. |
| Kubernetes | Helm chart DaemonSet | Node-local log collection and dashboard sidecar/deployment. |
| Debezium CDC | blazerules_io.unwrap_debezium(...) | Database change streams. |
| Arrow IPC | blazerules_io.ArrowIpcDecoder | Binary Arrow frames. |
| Avro | blazerules_io.AvroDecoder | Schema-based binary events. |
| Protobuf | blazerules_io.ProtobufDecoder | Descriptor-backed binary events. |
| S3 / local files | read_ndjson_bytes, read_record_batches | Backtests, batch files, rules, lookups, and models. |
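The "Plain text logs" row says to wrap lines as JSON first. One way to do that is sketched below using only the standard library. The helper name and the `message` and `source` field names are assumptions chosen for illustration; use whatever field names your rules reference.

```python
import json
from typing import Iterable


def wrap_lines_as_ndjson(lines: Iterable[str], source: str) -> bytes:
    """Turn raw text lines into NDJSON bytes, one JSON object per line.

    Hypothetical helper; "message" and "source" are illustrative field names.
    """
    out = []
    for line in lines:
        text = line.rstrip("\r\n")
        if not text:
            continue  # skip blank lines rather than emit empty records
        out.append(json.dumps({"message": text, "source": source}))
    if not out:
        return b""
    # NDJSON: newline-separated JSON objects, with a trailing newline.
    return ("\n".join(out) + "\n").encode("utf-8")


payload = wrap_lines_as_ndjson(
    ["ERROR disk full\n", "\n", "WARN retrying"], source="stdout"
)
```

The resulting bytes have the NDJSON shape that the matrix lists for `engine.evaluate_ndjson(bytes)` and for the HTTP input.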
Record Shape
JSON records can be flat or nested. Rules use dotted paths:
    conditions:
      and:
        - field: merchant.risk.score
          op: gt
          value: 50
        - array_any:
            path: items
            where:
              and:
                - field: price
                  op: gt
                  value: 100
                - field: category
                  op: eq
                  value: electronics

Arrow, Avro, Protobuf, and PyArrow inputs use the same logical field names. Nested structs are projected into the same dotted namespace, so a rule does not care which wire format produced the batch.
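The following sketch shows how a dotted path and an `array_any` condition can be read against a nested record. It is a plain-Python model of the semantics as described here, not the BlazeRules evaluator. It assumes that a missing field makes a condition false and that `array_any` requires a single array element to satisfy the whole `where` clause.

```python
_MISSING = object()

# Only the two operators used in the example rule are modeled here.
OPS = {
    "gt": lambda a, b: a > b,
    "eq": lambda a, b: a == b,
}


def get_path(record, path):
    """Walk a dotted path such as 'merchant.risk.score' through nested dicts."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def matches(cond, record) -> bool:
    """Evaluate a condition tree against one record (illustrative model)."""
    if "and" in cond:
        return all(matches(c, record) for c in cond["and"])
    if "array_any" in cond:
        spec = cond["array_any"]
        items = get_path(record, spec["path"])
        if not isinstance(items, list):
            return False
        # True if at least one element satisfies the whole 'where' clause.
        return any(matches(spec["where"], item) for item in items)
    value = get_path(record, cond["field"])
    if value is _MISSING:
        return False  # assumption: missing fields never match
    return OPS[cond["op"]](value, cond["value"])


rule = {"and": [
    {"field": "merchant.risk.score", "op": "gt", "value": 50},
    {"array_any": {"path": "items", "where": {"and": [
        {"field": "price", "op": "gt", "value": 100},
        {"field": "category", "op": "eq", "value": "electronics"},
    ]}}},
]}

event = {
    "merchant": {"risk": {"score": 72}},
    "items": [
        {"price": 20, "category": "books"},
        {"price": 250, "category": "electronics"},
    ],
}
hit = matches(rule, event)
```

Under these assumptions `hit` is true because the second item is both priced above 100 and in the `electronics` category. A record whose expensive item and whose electronics item are different elements would not match.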
Error Handling
Rules and lookups fail fast at load time. Ingest can be tolerant:
SKIP_AND_COUNT: skip malformed records and count/sample errors.
SKIP_TO_DEAD_LETTER: write bad rows to an NDJSON dead-letter file.
HARD_FAIL: throw on the first malformed row.
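The three modes can be modeled in a few lines of plain Python. The sketch below is an illustration of the behavior described above, not BlazeRules code: the `ErrorMode` enum and `parse_ndjson` function are hypothetical, and the dead-letter rows are kept in a list rather than written to a file.

```python
import json
from enum import Enum


class ErrorMode(Enum):
    # Names mirror the modes listed above; the enum itself is hypothetical.
    SKIP_AND_COUNT = "skip_and_count"
    SKIP_TO_DEAD_LETTER = "skip_to_dead_letter"
    HARD_FAIL = "hard_fail"


def parse_ndjson(payload: bytes, mode: ErrorMode):
    """Parse NDJSON bytes, handling malformed rows according to mode.

    Returns (good_records, dead_letter_rows, error_count).
    """
    good, dead_letter, errors = [], [], 0
    for raw in payload.splitlines():
        if not raw.strip():
            continue  # ignore blank lines
        try:
            good.append(json.loads(raw))
        except ValueError:  # covers JSONDecodeError and bad UTF-8
            if mode is ErrorMode.HARD_FAIL:
                raise
            errors += 1
            if mode is ErrorMode.SKIP_TO_DEAD_LETTER:
                # Keep the raw row so it can be inspected or replayed later.
                dead_letter.append(raw)
    return good, dead_letter, errors


payload = b'{"id": 1}\n{broken\n{"id": 2}\n'
good, dlq, errors = parse_ndjson(payload, ErrorMode.SKIP_TO_DEAD_LETTER)
```

With `SKIP_AND_COUNT` the same payload yields the same two good records and an error count of one, but nothing is retained for the bad row. With `HARD_FAIL` the call raises on the second line.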
See Decision and DLQ Logs and Handle Dirty Data.
Recipes
POST NDJSON batches to /v1/logs.
Pipe terminal or service logs into the agent.
Tail application or pod log files.
Wrap unstructured text into JSON records.
Run the agent as a DaemonSet.
Consume, evaluate, and publish decisions.
Decode Arrow IPC, Avro, or Protobuf to Arrow batches.
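As a small illustration of the first recipe, the sketch below builds an NDJSON `POST` request for `/v1/logs` with the standard library. The host, port, and `Content-Type` value are assumptions, as is the helper name; check the agent's configuration for the actual listener address and accepted headers.

```python
import json
import urllib.request
from typing import Dict, Iterable


def build_logs_request(events: Iterable[Dict], base_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying events as NDJSON.

    Hypothetical helper; base_url and the Content-Type value are assumptions.
    """
    body = "".join(json.dumps(event) + "\n" for event in events).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/logs",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )


request = build_logs_request(
    [{"level": "error", "msg": "disk full"}],
    base_url="http://127.0.0.1:8080",  # assumed listener address
)
# To send it once an agent is listening:
#     urllib.request.urlopen(request, timeout=5)
```

Building the request separately from sending it keeps the payload easy to inspect and test without a running listener.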