Recipe: Binary Format Ingestion

Use blazerules_io decoders for binary event formats without converting through JSON.

Each binary decoder produces an Arrow RecordBatch, which is passed directly to evaluate_batch.

Arrow IPC

import blazerules
import blazerules_io

engine = blazerules.RuleEngine()
engine.load_rules("rules.yaml")

# frame_bytes holds raw Arrow IPC bytes, for example the contents of batch.arrow.
decoder = blazerules_io.ArrowIpcDecoder()
batch = decoder.decode_batch([frame_bytes])
result = engine.evaluate_batch(batch)

CLI equivalent: run the same Arrow IPC decoder from a shell:
python - <<'PY'
import pathlib
import blazerules
import blazerules_io

engine = blazerules.RuleEngine()
engine.load_rules("rules.yaml")

frame_bytes = pathlib.Path("batch.arrow").read_bytes()
batch = blazerules_io.ArrowIpcDecoder().decode_batch([frame_bytes])
result = engine.evaluate_batch(batch)
print(result.n_records, result.n_matched)
PY

Avro

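# Reuses the engine loaded in the Arrow IPC example.
# schema_json is the Avro schema as JSON text; avro_bytes is the encoded Avro data.
decoder = blazerules_io.AvroDecoder(schema_json)
batch = decoder.decode_batch([avro_bytes])
result = engine.evaluate_batch(batch)

CLI equivalent: decode Avro from a shell: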
python - <<'PY'
import pathlib
import blazerules
import blazerules_io

engine = blazerules.RuleEngine()
engine.load_rules("rules.yaml")

schema_json = pathlib.Path("transaction.avsc").read_text()
avro_bytes = pathlib.Path("record.avrobin").read_bytes()
batch = blazerules_io.AvroDecoder(schema_json).decode_batch([avro_bytes])
result = engine.evaluate_batch(batch)
print(result.n_records, result.n_matched)
PY
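
The schema_json argument is an Avro schema serialized as JSON text. As a rough sketch of what such a schema might look like, the block below builds one with the standard library. The record name, field names, and nesting are hypothetical and chosen only to mirror the nested rule example later in this recipe; whether AvroDecoder accepts every Avro type shown here is an assumption to verify against the library.

```python
import json

# Hypothetical Avro record schema; names and fields are illustrative only.
transaction_schema = {
    "type": "record",
    "name": "Transaction",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "amount", "type": "double"},
        {
            "name": "merchant",
            "type": {
                "type": "record",
                "name": "Merchant",
                "fields": [
                    {
                        "name": "risk",
                        "type": {
                            "type": "record",
                            "name": "Risk",
                            "fields": [{"name": "score", "type": "int"}],
                        },
                    }
                ],
            },
        },
    ],
}

# The JSON text that would be passed as schema_json (or saved as transaction.avsc).
schema_json = json.dumps(transaction_schema, indent=2)
print(schema_json)
```

Writing schema_json to transaction.avsc would give the shell example a schema file to read.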

Protobuf

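# Reuses the engine loaded in the Arrow IPC example.
# descriptor_set_bytes is the compiled descriptor (for example descriptor.pb);
# "package.Transaction" names the message type to decode.
decoder = blazerules_io.ProtobufDecoder(descriptor_set_bytes, "package.Transaction")
batch = decoder.decode_batch([proto_bytes])
result = engine.evaluate_batch(batch)

CLI equivalent: decode Protobuf from a shell: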
python - <<'PY'
import pathlib
import blazerules
import blazerules_io

engine = blazerules.RuleEngine()
engine.load_rules("rules.yaml")

descriptor = pathlib.Path("descriptor.pb").read_bytes()
record = pathlib.Path("transaction.pb").read_bytes()
batch = blazerules_io.ProtobufDecoder(descriptor, "package.Transaction").decode_batch([record])
result = engine.evaluate_batch(batch)
print(result.n_records, result.n_matched)
PY

Nested Data

To reference a field inside nested structs, use a dotted path as the rule field:

conditions:
  field: merchant.risk.score
  op: gt
  value: 50
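
One way to read the dotted path is as a walk through nested structs, one key per segment. The plain-Python sketch below illustrates that reading; it is not how blazerules evaluates rules internally. The helper resolve_path, the sample record, and the choice to treat a missing field as a non-match are all assumptions made for illustration.

```python
from typing import Any, Optional


def resolve_path(record: dict, path: str) -> Optional[Any]:
    """Walk a dotted path through nested dicts; return None if a step is missing."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical record with the shape the rule above expects.
record = {"merchant": {"risk": {"score": 72}}}

score = resolve_path(record, "merchant.risk.score")
# Equivalent of op: gt, value: 50. A missing field is assumed not to match.
matched = score is not None and score > 50
print(matched)
```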

To match against arrays of objects, use array_any:

conditions:
  array_any:
    path: items
    where:
      and:
        - field: price
          op: gt
          value: 100
        - field: category
          op: eq
          value: electronics
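
The name array_any suggests that the rule matches when at least one element of items satisfies the whole where block. The sketch below shows that interpretation in plain Python. It is an assumption about the semantics, not the library's implementation, and the helper names and sample records are hypothetical.

```python
def item_matches(item: dict) -> bool:
    """The `and` block: both conditions must hold for the same element."""
    price = item.get("price")
    return price is not None and price > 100 and item.get("category") == "electronics"


def array_any(record: dict, path: str) -> bool:
    """True if at least one element of the array at `path` satisfies the conditions."""
    return any(item_matches(item) for item in record.get(path, []))


# No single item is both expensive and electronics, so this record does not match.
split_record = {
    "items": [
        {"price": 250, "category": "books"},
        {"price": 40, "category": "electronics"},
    ]
}

# The second item satisfies both conditions, so this record matches.
hit_record = {
    "items": [
        {"price": 15, "category": "books"},
        {"price": 120, "category": "electronics"},
    ]
}

print(array_any(split_record, "items"), array_any(hit_record, "items"))
```

Under this reading, the conditions are checked per element, so a record whose expensive item and electronics item are different elements does not match.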