C++ API
Embed the BlazeRules C++ core directly — construct the engine, load rules, evaluate batches, and read decisions without Python in the hot path.
The Python module blazerules is a thin binding over a C++20 core. When you don't want Python anywhere near the hot path — you ship a single native binary, link into an existing C++ service, or need the lowest possible per-batch overhead — you can embed that core directly. This page is the map: which headers to include, the engine surface that matters, and how the C++ names line up with the Python ones.
Same engine, two front doors
The C++ core and the Python module run the identical compiler, kernels, schema inference, and decision logic. Anything you read about evaluation semantics in Core Concepts or the Python API holds here; only the call syntax and a few enum names change.
Headers
All core headers live under include/blazerules/:
| Header | What it gives you |
|---|---|
| engine.h | RuleEngine, EngineConfig, BacktestConfig, RuleFileFormat |
| schema.h | ColumnType, FieldSpec, BlazeRulesSchema |
| batch_result.h | BatchResult, BacktestReport, timing and per-rule structures |
| conflict.h | ConflictReport (returned by load_rules) |
| rule_spec.h | Rule, action, and operator types shared across the API |
engine.h includes <arrow/api.h>, so an Apache Arrow development install must be on your include path.
The shape of the engine
A RuleEngine compiles a YAML rule set once into immutable execution plans, then evaluates batches of records against it. Each call returns a BatchResult carrying decisions, scores, risk bands, the winning rule per record, and — when you ask for them — per-rule bitmasks. Schema inference works exactly as in Python: pass a schema up front, or let the first evaluated batch bind it.
#include "blazerules/engine.h"
EngineConfig cfg;
cfg.output_detail = EngineConfig::OUTPUT_DECISIONS; // see enum note below
RuleEngine engine(cfg); // non-copyable; own it / pass by reference
ConflictReport report = engine.load_rules("rules.yaml");
// inspect `report` before serving — load is strict (see below)
std::string ndjson = /* one JSON object per line */;
BatchResult r = engine.evaluate_ndjson(ndjson);
for (size_t i = 0; i < r.decisions.size(); ++i) {
// r.decisions[i], r.scores[i], r.winning_rule_ids[i] ...
}
Constructors
explicit RuleEngine(EngineConfig config = {}): default or configured engine; schema inferred on first batch.
RuleEngine(std::vector<FieldSpec> fields, EngineConfig config = {}): bind a schema up front.
static BlazeRulesResult<std::unique_ptr<RuleEngine>> RuleEngine::create(BlazeRulesSchema schema, EngineConfig config): factory returning a result wrapper instead of throwing.
RuleEngine is non-copyable (the copy constructor and copy assignment are deleted). Move it or pass it by reference; never copy it into a container by value.
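To illustrate the ownership rule without depending on the real headers, the sketch below uses a hypothetical stand-in type, StubEngine, that deletes its copy operations the way the text says RuleEngine does. The names StubEngine, EnginePool, make_pool, and describe are invented for this example; only the pattern (hold engines through std::unique_ptr, hand them out by reference) is the point.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-copyable engine; NOT the real RuleEngine.
class StubEngine {
public:
    explicit StubEngine(std::string name) : name_(std::move(name)) {}
    StubEngine(const StubEngine&) = delete;             // no copy construction
    StubEngine& operator=(const StubEngine&) = delete;  // no copy assignment
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Any attempt to copy the type is rejected by the compiler.
static_assert(!std::is_copy_constructible<StubEngine>::value,
              "StubEngine must not be copyable");

// The container holds owning pointers, so growing it never copies an engine.
using EnginePool = std::vector<std::unique_ptr<StubEngine>>;

EnginePool make_pool(std::size_t n) {
    EnginePool pool;
    pool.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        pool.push_back(std::make_unique<StubEngine>("engine-" + std::to_string(i)));
    }
    return pool;
}

// Callers borrow an engine by reference; ownership stays with the pool.
const std::string& describe(const StubEngine& engine) {
    return engine.name();
}
```

With a pool built by make_pool(3), describe(*pool[1]) borrows the second engine without transferring or duplicating it.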
Loading rules
ConflictReport load_rules(const std::string& rules_path);
ConflictReport load_rules_from_string(const std::string& rules_yaml_or_json,
                                      RuleFileFormat format = RuleFileFormat::YAML); // enum class RuleFileFormat { YAML, JSON }
ConflictReport reload_rules_now(const std::string& rules_path);
ConflictReport analyze_conflicts(const std::string& rules_path);
std::string active_rule_set_version() const;
load_rules returns a report, not void
Activation is strict: malformed YAML, duplicate rule IDs, or references to missing lookups fail before the new rule set goes live. load_rules hands back a ConflictReport describing overlaps and conflicts. Check it before you start serving decisions rather than discovering a bad rule set at evaluation time.
Evaluating batches
BatchResult evaluate_ndjson(std::string_view ndjson_bytes);
BatchResult evaluate_ndjson_padded(std::string_view ndjson_bytes); // input already simdjson-padded
BatchResult evaluate_ndjson_file(const std::string& path); // mmap, zero-copy replay
BatchResult evaluate_record_batch(const std::shared_ptr<arrow::RecordBatch>& batch);
BatchResult evaluate_batch(const std::shared_ptr<arrow::RecordBatch>& batch); // inline alias for evaluate_record_batch
BatchResult evaluate_messages(const std::vector<std::string>& messages);
BatchResult evaluate_message_views(const std::vector<std::string_view>& messages);
Prefer evaluate_batch when upstream data is already typed Arrow, and evaluate_ndjson for JSON streams. Each evaluator has an _into(..., BatchResult& out) variant that reuses an existing result object to avoid per-batch allocation in tight loops.
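evaluate_ndjson expects one JSON object per line. A minimal sketch of assembling such a payload from already-serialized records follows. join_ndjson is a hypothetical helper, not part of BlazeRules, and it assumes each input string is a complete JSON object with no raw newline inside it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins serialized JSON objects into an NDJSON payload,
// one object per line, each line terminated by '\n'.
// `out` is cleared rather than replaced, so a caller can reuse the same
// buffer (and its capacity) across batches.
void join_ndjson(const std::vector<std::string>& records, std::string& out) {
    out.clear();

    // Reserve once up front: each record plus its trailing newline.
    std::size_t total = 0;
    for (const std::string& rec : records) {
        total += rec.size() + 1;
    }
    out.reserve(total);

    for (const std::string& rec : records) {
        out.append(rec);
        out.push_back('\n');
    }
}
```

The filled buffer can then be passed to evaluate_ndjson, which takes a std::string_view over the bytes, so the buffer has to outlive the call.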
Advanced: sharding and partition affinity
For window-heavy streaming workloads you can keep entity affinity across shards: create_shards(int) returns owned per-shard engines, and the evaluate_partition(int partition_id, ...) overloads evaluate a single partition's records. Use these only when a profiler tells you a single engine's window state is the bottleneck.
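Entity affinity means the same entity key always lands on the same partition. The text does not say how partition_id should be derived, so the sketch below is one assumed approach: a 64-bit FNV-1a hash of the key, reduced modulo the partition count. fnv1a_64 and partition_for are hypothetical helpers, not BlazeRules functions. FNV-1a is used here instead of std::hash because its output is fixed by the algorithm rather than by the standard library implementation.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
std::uint64_t fnv1a_64(const std::string& key) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : key) {
        hash ^= byte;
        hash *= 1099511628211ULL;  // FNV prime
    }
    return hash;
}

// Maps an entity key to a partition in [0, num_partitions).
// Returns -1 when num_partitions is not positive.
int partition_for(const std::string& entity_key, int num_partitions) {
    if (num_partitions <= 0) {
        return -1;
    }
    const std::uint64_t n = static_cast<std::uint64_t>(num_partitions);
    return static_cast<int>(fnv1a_64(entity_key) % n);
}
```

Records would be grouped by partition_for(key, shard_count) before each group is handed to the matching partition, so window state for one entity is only ever touched by one shard.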
Models, hot reload, schema, backtest
void register_model(const std::string& name, const std::string& path); // ONNX
int num_models() const;
void enable_hot_reload(const std::string& rules_file_path,
                       std::chrono::seconds poll_interval = std::chrono::seconds(5));
void stop_hot_reload();
HotReloadStatus hot_reload_status() const;
const BlazeRulesSchema& schema() const;
bool schema_bound() const;
SchemaState schema_state() const; // enum class SchemaState { UNBOUND, INFERRED_BOUND, USER_BOUND }
BacktestReport backtest(const BacktestConfig& config);
ONNX is build-gated
model_score rules and register_model(...) require an ONNX-enabled build (BLAZERULES_ENABLE_ONNX, default ON). In a build with ONNX off, model_score rules are rejected at compile time and register_model throws.
See Backtesting a Candidate for the backtest(...) workflow.
The methods above are the load → evaluate → read surface most embedders need. engine.h also exposes reset_window_state(), num_window_channels(), in-process metrics (enable_metrics(), reset_metrics(), metrics_counters(), metrics_gauges(), metrics_histograms()), stats(), and partition-affine evaluation helpers.
C++ vs Python, side by side
import blazerules
config = blazerules.EngineConfig()
config.output_detail = blazerules.OutputDetail.DECISIONS
engine = blazerules.RuleEngine(config)
engine.load_rules("rules.yaml")
result = engine.evaluate_ndjson(ndjson_bytes)
for decision in result.decisions:
    ...
Enum names differ between the two front doors
The EngineConfig modes are nested unscoped enums inside the struct, so you reference them as EngineConfig::<NAME>. The constant names are not identical to the Python ones:
| Setting | Python | C++ |
|---|---|---|
| Output detail | OutputDetail.DECISIONS | EngineConfig::OUTPUT_DECISIONS |
| Output detail | OutputDetail.BITMASKS | EngineConfig::OUTPUT_BITMASKS |
| Ingest errors | IngestErrorMode.SKIP_AND_COUNT | EngineConfig::SKIP_AND_COUNT |
| Type mismatch | TypeMismatchMode.NULL_ON_TYPE_ERROR | EngineConfig::NULL_ON_TYPE_ERROR |
The C++ default is OUTPUT_BITMASKS
In C++, EngineConfig::output_detail defaults to OUTPUT_BITMASKS. If you only need decisions and scores, set it to EngineConfig::OUTPUT_DECISIONS explicitly; that is the cheaper path and matches the Python guidance.
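As a language-level illustration of why the constants are spelled EngineConfig::<NAME>, the sketch below declares a hypothetical stand-in struct with a nested unscoped enum. StubConfig, the enum type name OutputDetail, and the enumerator order are assumptions made for this example; the enumerator names and the OUTPUT_BITMASKS default are taken from the text above.

```cpp
// Hypothetical stand-in that mirrors the described layout; NOT the real EngineConfig.
struct StubConfig {
    // Unscoped enum nested in the struct: its enumerators are reachable as
    // StubConfig::<NAME>, without naming the enum type itself.
    enum OutputDetail { OUTPUT_DECISIONS, OUTPUT_BITMASKS };

    OutputDetail output_detail = OUTPUT_BITMASKS;  // default described in the text
};

// Returns a config that opts into the cheaper decisions-only output.
StubConfig decisions_only() {
    StubConfig cfg;
    cfg.output_detail = StubConfig::OUTPUT_DECISIONS;
    return cfg;
}
```

A default-constructed StubConfig carries OUTPUT_BITMASKS, which is why the explicit assignment is needed when only decisions and scores are wanted.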
Reading a BatchResult in C++
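The data is the same as the Python result, but some names and access styles differ: grouped_decision_indices is a method in Python and a plain field in C++, timing_ms is an attribute in Python and a method in C++, and two fields are renamed.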
| Data | Python | C++ |
|---|---|---|
| Records / matches | n_records, n_matched | n_records, n_matched (fields) |
| Decisions | decisions, scores, winning_rule_ids | same field names |
| Grouped indices | grouped_decision_indices() (method) | grouped_decision_indices (member field) |
| Per-rule counts | match_counts | rule_match_counts (member field) |
| Matched rows | matched_indices | matched_record_indices (member field) |
| Timing | timing_ms | timing_ms() (method → unordered_map<string,double>) |
BatchResult r = engine.evaluate_ndjson(ndjson);
int matched = r.n_matched;
const auto& approve_rows = r.grouped_decision_indices["approve"]; // member access
auto timings = r.timing_ms(); // method in C++ too
double total_ms = timings["total"];
Zero-copy buffer helpers (rule_bitmask_buffer(id), matched_indices_buffer(), decision_codes_buffer(), grouped_indices_buffer(decision)) expose the underlying memory as arrow::Buffer when you need to hand results to another Arrow consumer without copying.
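One C++ detail is worth noting for the lookups above: on standard map types, operator[] inserts a default entry when the key is absent. The sketch below shows read-only lookups with find instead. timing_ms() is documented above as returning unordered_map<string,double>; the container type assumed here for grouped indices (unordered_map<string, vector<uint32_t>>) is a guess, and both helper names are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Assumed shape of the grouped-indices field; the real type may differ.
using GroupedIndices = std::unordered_map<std::string, std::vector<std::uint32_t>>;

// Returns the rows for `decision`, or nullptr if no record got that decision.
// Unlike operator[], find() never inserts into the map.
const std::vector<std::uint32_t>* rows_for(const GroupedIndices& grouped,
                                           const std::string& decision) {
    auto it = grouped.find(decision);
    return it == grouped.end() ? nullptr : &it->second;
}

// Returns the timing for `stage`, or `fallback` when that stage was not reported.
double timing_or(const std::unordered_map<std::string, double>& timings,
                 const std::string& stage, double fallback) {
    auto it = timings.find(stage);
    return it == timings.end() ? fallback : it->second;
}
```

Both helpers take the map by const reference, so they also work on a result that is only available as const.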
Linking against the core
The core build/link target is blazerules_core. In your CMakeLists.txt:
# either find an installed package...
# find_package(blazerules CONFIG REQUIRED)
# ...or add the repo as a subdirectory
add_subdirectory(third_party/blazerules)
add_executable(app main.cpp)
target_link_libraries(app PRIVATE blazerules_core)
The install export namespace is blazerules::, so installed consumers can link blazerules::blazerules_core when using the generated CMake package. In-tree builds link the target as blazerules_core.