C++ API

Embed the BlazeRules C++ core directly — construct the engine, load rules, evaluate batches, and read decisions without Python in the hot path.

The Python module blazerules is a thin binding over a C++20 core. When you don't want Python anywhere near the hot path — you ship a single native binary, link into an existing C++ service, or need the lowest possible per-batch overhead — you can embed that core directly. This page is the map: which headers to include, the engine surface that matters, and how the C++ names line up with the Python ones.

📘

Same engine, two front doors

The C++ core and the Python module run the identical compiler, kernels, schema inference, and decision logic. Anything you read about evaluation semantics in Core Concepts or the Python API holds here — only the call syntax and a few enum names change.

Headers

All core headers live under include/blazerules/:

Header | What it gives you
engine.h | RuleEngine, EngineConfig, BacktestConfig, RuleFileFormat
schema.h | ColumnType, FieldSpec, BlazeRulesSchema
batch_result.h | BatchResult, BacktestReport, timing and per-rule structures
conflict.h | ConflictReport (returned by load_rules)
rule_spec.h | Rule, action, and operator types shared across the API

engine.h includes <arrow/api.h>, so an Apache Arrow development install must be on your include path.

The shape of the engine

A RuleEngine compiles a YAML rule set once into immutable execution plans, then evaluates batches of records against it. Each call returns a BatchResult carrying decisions, scores, risk bands, the winning rule per record, and — when you ask for them — per-rule bitmasks. Schema inference works exactly as in Python: pass a schema up front, or let the first evaluated batch bind it.

#include <cstddef>
#include <string>

#include "blazerules/engine.h"

EngineConfig cfg;
cfg.output_detail = EngineConfig::OUTPUT_DECISIONS;   // see enum note below

RuleEngine engine(cfg);                               // non-copyable; own it / pass by reference
ConflictReport report = engine.load_rules("rules.yaml");
// inspect `report` before serving: load is strict (see below)

std::string ndjson = /* one JSON object per line */;
BatchResult r = engine.evaluate_ndjson(ndjson);

// the per-record arrays are parallel: index i refers to the same record in each
for (std::size_t i = 0; i < r.decisions.size(); ++i) {
    // r.decisions[i], r.scores[i], r.winning_rule_ids[i] ...
}

Constructors

  • explicit RuleEngine(EngineConfig config = {}) — default or configured engine; schema inferred on first batch.
  • RuleEngine(std::vector<FieldSpec> fields, EngineConfig config = {}) — bind a schema up front.
  • static BlazeRulesResult<std::unique_ptr<RuleEngine>> RuleEngine::create(BlazeRulesSchema schema, EngineConfig config) — factory returning a result wrapper instead of throwing.

RuleEngine is non-copyable (the copy constructor and copy assignment are deleted). Move it or pass it by reference; any attempt to copy it, for example pushing an existing engine into a container by value, fails to compile.
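
One common way to keep several non-copyable objects in a container is to store them behind std::unique_ptr. The sketch below uses StandInEngine, a hypothetical class that only mimics the deleted copy operations described above; it is not the real RuleEngine, and make_pool and describe are illustrative names.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in with deleted copy operations; NOT the real RuleEngine.
class StandInEngine {
public:
    explicit StandInEngine(int id) : id_(id) {}
    StandInEngine(const StandInEngine&) = delete;
    StandInEngine& operator=(const StandInEngine&) = delete;
    int id() const { return id_; }

private:
    int id_;
};

// Copying is rejected by the compiler, which is what the deleted members buy you.
static_assert(!std::is_copy_constructible_v<StandInEngine>,
              "StandInEngine must not be copyable");

// The vector owns pointers, so growing or moving it never copies an engine.
std::vector<std::unique_ptr<StandInEngine>> make_pool(int n) {
    std::vector<std::unique_ptr<StandInEngine>> pool;
    pool.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        pool.push_back(std::make_unique<StandInEngine>(i));
    }
    return pool;
}

// Callers borrow an engine by reference instead of taking a copy.
int describe(const StandInEngine& engine) { return engine.id(); }
```

Calling make_pool(3) and then describe(*pool[2]) reads the third engine without copying it. The same ownership pattern should carry over to the real type, since RuleEngine::create already returns a std::unique_ptr inside its result wrapper.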

Loading rules

ConflictReport load_rules(const std::string& rules_path);
ConflictReport load_rules_from_string(const std::string& rules_yaml_or_json,
                                      RuleFileFormat format = RuleFileFormat::YAML); // enum class RuleFileFormat { YAML, JSON }
ConflictReport reload_rules_now(const std::string& rules_path);
ConflictReport analyze_conflicts(const std::string& rules_path);
std::string    active_rule_set_version() const;

🚧

load_rules returns a report, not void

Activation is strict: malformed YAML, duplicate rule IDs, or references to missing lookups fail before the new rule set goes live. load_rules hands back a ConflictReport describing overlaps and conflicts. Check it before you start serving decisions rather than discovering a bad rule set at evaluation time.

Evaluating batches

BatchResult evaluate_ndjson(std::string_view ndjson_bytes);
BatchResult evaluate_ndjson_padded(std::string_view ndjson_bytes);   // input already simdjson-padded
BatchResult evaluate_ndjson_file(const std::string& path);           // mmap, zero-copy replay
BatchResult evaluate_record_batch(const std::shared_ptr<arrow::RecordBatch>& batch);
BatchResult evaluate_batch(const std::shared_ptr<arrow::RecordBatch>& batch); // inline alias for evaluate_record_batch
BatchResult evaluate_messages(const std::vector<std::string>& messages);
BatchResult evaluate_message_views(const std::vector<std::string_view>& messages);

Prefer evaluate_batch when upstream data is already typed Arrow, and evaluate_ndjson for JSON streams. Each evaluator has an _into(..., BatchResult& out) variant that reuses an existing result object to avoid per-batch allocation in tight loops.
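
If you already hold an NDJSON buffer but want to call evaluate_message_views, the buffer can be split into views without copying any bytes. This is a minimal sketch using only the standard library; split_ndjson_lines is a hypothetical helper, and it assumes each message is expected to be one JSON object, which the text above does not state explicitly.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Split an NDJSON buffer into one view per non-empty line.
// The views point into `ndjson`, so the buffer must outlive the result.
std::vector<std::string_view> split_ndjson_lines(std::string_view ndjson) {
    std::vector<std::string_view> lines;
    std::size_t start = 0;
    while (start < ndjson.size()) {
        std::size_t end = ndjson.find('\n', start);
        if (end == std::string_view::npos) {
            end = ndjson.size();               // last line has no trailing newline
        }
        std::string_view line = ndjson.substr(start, end - start);
        if (!line.empty() && line.back() == '\r') {
            line.remove_suffix(1);             // tolerate CRLF line endings
        }
        if (!line.empty()) {
            lines.push_back(line);             // skip blank lines
        }
        start = end + 1;
    }
    return lines;
}
```

The returned vector has the type that evaluate_message_views accepts. For a buffer that is already line-delimited, calling evaluate_ndjson directly is simpler; the split is useful mainly when you need to filter or reorder messages first.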

Advanced: sharding and partition affinity

For window-heavy streaming workloads you can keep entity affinity across shards: create_shards(int) returns owned per-shard engines, and the evaluate_partition(int partition_id, ...) overloads evaluate a single partition's records. Use these only when a profiler tells you a single engine's window state is the bottleneck.
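
Entity affinity means every record for a given entity reaches the same partition. How BlazeRules assigns partitions internally is not documented here, so the sketch below is only one way a caller might derive the partition_id argument; partition_for is a hypothetical helper.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string_view>

// Map an entity key to a partition id in [0, num_shards).
// Assumption: the caller chooses the partition; the library's own scheme may differ.
int partition_for(std::string_view entity_key, int num_shards) {
    if (num_shards <= 0) {
        throw std::invalid_argument("num_shards must be positive");
    }
    const std::size_t h = std::hash<std::string_view>{}(entity_key);
    return static_cast<int>(h % static_cast<std::size_t>(num_shards));
}
```

std::hash is deterministic within a single process but is not guaranteed to be stable across standard library implementations or program runs. If partition assignments must survive restarts or be shared between services, substitute a fixed hash function.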

Models, hot reload, schema, backtest

void register_model(const std::string& name, const std::string& path);  // ONNX
int  num_models() const;

void enable_hot_reload(const std::string& rules_file_path,
                       std::chrono::seconds poll_interval = std::chrono::seconds(5));
void stop_hot_reload();
HotReloadStatus hot_reload_status() const;

const BlazeRulesSchema& schema() const;
bool schema_bound() const;
SchemaState schema_state() const;   // enum class SchemaState { UNBOUND, INFERRED_BOUND, USER_BOUND }

BacktestReport backtest(const BacktestConfig& config);

🚧

ONNX is build-gated

model_score rules and register_model(...) require an ONNX-enabled build (BLAZERULES_ENABLE_ONNX, default ON). In a build with ONNX off, model_score rules are rejected when the rule set is compiled (at load, not at C++ compile time) and register_model throws. See Backtesting a Candidate for the backtest(...) workflow.

The methods above are the load → evaluate → read surface most embedders need. engine.h also exposes reset_window_state(), num_window_channels(), in-process metrics (enable_metrics(), reset_metrics(), metrics_counters(), metrics_gauges(), metrics_histograms()), stats(), and partition-affine evaluation helpers.
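
When logging the value returned by schema_state(), a small name helper keeps log lines readable. The enum below is a local copy of the declaration quoted in the listing above so that the sketch compiles on its own; schema_state_name is a hypothetical helper, not part of the library. Drop the local enum when you include the real header.

```cpp
#include <string_view>

// Local copy of the enum shown in the listing above, for this sketch only.
enum class SchemaState { UNBOUND, INFERRED_BOUND, USER_BOUND };

// Return a printable name for a schema state.
constexpr std::string_view schema_state_name(SchemaState state) {
    switch (state) {
        case SchemaState::UNBOUND:        return "UNBOUND";
        case SchemaState::INFERRED_BOUND: return "INFERRED_BOUND";
        case SchemaState::USER_BOUND:     return "USER_BOUND";
    }
    return "UNKNOWN";  // reached only for out-of-range values
}
```

With the real engine, the call would look like schema_state_name(engine.schema_state()).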

C++ vs Python, side by side

The Python form is shown below; the equivalent C++ is the snippet under "The shape of the engine" above.

   import blazerules

   config = blazerules.EngineConfig()
   config.output_detail = blazerules.OutputDetail.DECISIONS

   engine = blazerules.RuleEngine(config)
   engine.load_rules("rules.yaml")

   result = engine.evaluate_ndjson(ndjson_bytes)
   for decision in result.decisions:
       ...

Enum names differ between the two front doors

The EngineConfig modes are nested unscoped enums inside the struct, so you reference them as EngineConfig::<NAME>. The constant names are not identical to the Python ones:

Setting | Python | C++
Output detail | OutputDetail.DECISIONS | EngineConfig::OUTPUT_DECISIONS
Output detail | OutputDetail.BITMASKS | EngineConfig::OUTPUT_BITMASKS
Ingest errors | IngestErrorMode.SKIP_AND_COUNT | EngineConfig::SKIP_AND_COUNT
Type mismatch | TypeMismatchMode.NULL_ON_TYPE_ERROR | EngineConfig::NULL_ON_TYPE_ERROR

📘

The C++ default is OUTPUT_BITMASKS

In C++, EngineConfig::output_detail defaults to OUTPUT_BITMASKS. If you only need decisions and scores, set it to EngineConfig::OUTPUT_DECISIONS explicitly — that is the cheaper path and matches the Python guidance.
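
The naming rule is easier to see in isolation. StandInConfig below is a hypothetical struct that only imitates the layout described above (an unscoped enum nested in the config struct, with the bitmask mode as the default); the enum type name OutputDetail and the enumerator order are assumptions, and the real EngineConfig has more members.

```cpp
// Hypothetical stand-in for EngineConfig; NOT the real struct.
struct StandInConfig {
    // Unscoped nested enum: enumerators are reachable as StandInConfig::<NAME>.
    enum OutputDetail { OUTPUT_DECISIONS, OUTPUT_BITMASKS };

    // Mirrors the documented C++ default.
    OutputDetail output_detail = OUTPUT_BITMASKS;
};

// Build a config that opts in to the cheaper decisions-only output.
StandInConfig decisions_only_config() {
    StandInConfig cfg;
    cfg.output_detail = StandInConfig::OUTPUT_DECISIONS;
    return cfg;
}
```

Because the enum is unscoped, the enumerator is qualified only by the struct name, not by the enum type, which is why the C++ column in the table reads EngineConfig::OUTPUT_DECISIONS rather than something like EngineConfig::OutputDetail::DECISIONS.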

Reading a BatchResult in C++

The data is the same as the Python result, but a few members differ in name or in whether they are a field or a method:

Data | Python | C++
Records / matches | n_records, n_matched | n_records, n_matched (fields)
Decisions | decisions, scores, winning_rule_ids | same field names
Grouped indices | grouped_decision_indices() (method) | grouped_decision_indices (member field)
Per-rule counts | match_counts | rule_match_counts (member field)
Matched rows | matched_indices | matched_record_indices (member field)
Timing | timing_ms | timing_ms() (method returning unordered_map<string,double>)

BatchResult r = engine.evaluate_ndjson(ndjson);

int matched = r.n_matched;
const auto& approve_rows = r.grouped_decision_indices["approve"];   // member access
auto timings = r.timing_ms();                                       // method in C++ too
double total_ms = timings["total"];

Zero-copy buffer helpers (rule_bitmask_buffer(id), matched_indices_buffer(), decision_codes_buffer(), grouped_indices_buffer(decision)) expose the underlying memory as arrow::Buffer when you need to hand results to another Arrow consumer without copying.
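
In the snippet above, timings["total"] uses operator[], which inserts a zero entry when the key is absent. A lookup through find avoids that silent insertion. This sketch relies only on the std::unordered_map<std::string, double> type given in the table; timing_or is a hypothetical helper, and the set of keys the engine actually reports is not listed on this page.

```cpp
#include <string>
#include <unordered_map>

// Read a timing entry without inserting a default when the key is missing.
double timing_or(const std::unordered_map<std::string, double>& timings,
                 const std::string& key,
                 double fallback) {
    const auto it = timings.find(key);
    return it == timings.end() ? fallback : it->second;
}
```

With a real result this would read timing_or(r.timing_ms(), "total", 0.0). Passing a negative fallback makes a missing key distinguishable from a genuine zero.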

Linking against the core

The core build/link target is blazerules_core. In your CMakeLists.txt:

# either find an installed package...
# find_package(blazerules CONFIG REQUIRED)
# ...or add the repo as a subdirectory
add_subdirectory(third_party/blazerules)

add_executable(app main.cpp)
target_link_libraries(app PRIVATE blazerules_core)

The install export namespace is blazerules::, so installed consumers can link blazerules::blazerules_core when using the generated CMake package. In-tree builds link the target as blazerules_core.

Next steps