Reconcile
Run a configured pair and understand the output flags that control format, streaming, and audit behavior.
Running a reconciliation is a single command once your config is valid. The command streams the left source against an indexed right source and emits every match, difference, and unmatched row to a file or stdout. The flags you choose here determine how the output is structured, how much memory the run uses, and whether the result is reproducible.
What you'll learn
By the end of this guide, you will know:
- how to run a basic reconciliation and write results to a file,
- when to use streaming formats and why they matter for large files,
- and how to override input files per run without editing your config.
How it works
Reconify indexes the right source first, then streams the left source row by row and emits events as it goes. See Overview for the full pipeline and Matching for how individual rows are classified.
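The index-then-stream flow described above can be illustrated with a minimal sketch. This is not Reconify's implementation: the `reference` and `amount` field names, the exact-equality comparison, and the event type names (inferred from the summary counters) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator

Row = Dict[str, str]


def reconcile(left_rows: Iterable[Row], right_rows: Iterable[Row]) -> Iterator[dict]:
    """Illustrative index-then-stream reconciliation (hypothetical schema)."""
    # Step 1: index the whole right source by reference. This is the part
    # that must fit in memory. A duplicate reference overwrites the earlier row.
    right_index = {row["reference"]: row for row in right_rows}
    seen = set()

    # Step 2: stream the left source and emit one event per row as it is read.
    for left in left_rows:
        ref = left["reference"]
        right = right_index.get(ref)
        if right is None:
            yield {"type": "unmatched_left", "reference": ref}
            continue
        seen.add(ref)
        if left["amount"] == right["amount"]:
            yield {"type": "matched", "reference": ref}
        else:
            yield {"type": "amount_diff", "reference": ref,
                   "left": left["amount"], "right": right["amount"]}

    # Step 3: right rows that no left row referred to are unmatched.
    for ref in right_index:
        if ref not in seen:
            yield {"type": "unmatched_right", "reference": ref}


# Made-up sample rows, for illustration only.
left = [
    {"reference": "A1", "amount": "10.00"},
    {"reference": "A2", "amount": "5.00"},
    {"reference": "A3", "amount": "7.50"},
]
right = [
    {"reference": "A1", "amount": "10.00"},
    {"reference": "A2", "amount": "5.50"},
    {"reference": "B9", "amount": "1.00"},
]
events = list(reconcile(left, right))
```

Because only the right side is held in memory, a layout like this keeps memory use tied to the size of the right source rather than the left.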
Command
reconify reconcile \
--config reconify.yaml \
--pair bank_vs_stripe \
--out results.json
Steps
Run a basic reconciliation
The --pair value must match a key under pairs in your config. --out writes the results to a file; omit it, or pass -o -, to write to stdout.
reconify reconcile \
--config reconify.yaml \
--pair bank_vs_stripe \
--out results.json
Choose an output format
Pass --format to control how results are structured. The default is json.
reconify reconcile \
--config reconify.yaml \
--pair bank_vs_stripe \
--format ndjson \
--out results.ndjson
| Format | When to use |
|---|---|
| json | Default. Full result as a single JSON object. Good for files under ~500k rows. |
| json-stream | Each section emitted as a separate JSON object. Lower peak memory. |
| ndjson | One event per line. Best for piping into jq or downstream tools. |
| csv | Flat row-per-event. Best for loading into spreadsheets or databases. |
| table | Human-readable terminal output. Not suitable for machine consumption. |
For files above roughly 500k rows, prefer json-stream, ndjson, or csv.
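To show why a line-oriented format helps with large outputs, here is a small sketch that tallies NDJSON events one line at a time, so only a single event is held in memory at once. It assumes every event carries a `type` field, as the summary event does; the other event shapes in the sample are hypothetical.

```python
import io
import json
from collections import Counter
from typing import Iterable


def count_event_types(lines: Iterable[str]) -> Counter:
    """Tally events by their "type" field, reading one line at a time."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        event = json.loads(line)
        counts[event.get("type", "unknown")] += 1
    return counts


# Stand-in for an open results.ndjson file (made-up events).
sample = io.StringIO(
    '{"type": "matched", "reference": "A1"}\n'
    '{"type": "matched", "reference": "A2"}\n'
    '{"type": "unmatched_left", "reference": "A3"}\n'
    '{"type": "summary", "matched_count": 2, "unmatched_left_count": 1}\n'
)
counts = count_event_types(sample)
```

With a real run, the same function could be given an open file handle, for example `with open("results.ndjson") as fh: count_event_types(fh)`. The default json format, by contrast, has to be parsed as one object before anything can be inspected.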
Override input files without editing the config
To run a specific pair of files rather than the glob from your config, pass --left-file and --right-file:
reconify reconcile \
--config reconify.yaml \
--pair bank_vs_stripe \
--left-file data/bank/january.csv \
--right-file data/stripe/january.csv \
--out results.json
This is useful in CI pipelines where filenames include a date or run ID.
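As a sketch of the CI case, the helper below builds the argument list for one run from a month label. It uses only the flags shown on this page; the function name, the `data/<source>/<month>.csv` layout (modelled on the example above), and the `results-<month>.json` output name are assumptions for illustration.

```python
from typing import List


def build_reconcile_argv(month: str, pair: str = "bank_vs_stripe") -> List[str]:
    """Build a reconify command line for one month's files (hypothetical layout)."""
    return [
        "reconify", "reconcile",
        "--config", "reconify.yaml",
        "--pair", pair,
        # Per-run overrides: these replace the glob from the config.
        "--left-file", f"data/bank/{month}.csv",
        "--right-file", f"data/stripe/{month}.csv",
        # Hypothetical naming scheme so each run keeps its own output.
        "--out", f"results-{month}.json",
    ]


argv = build_reconcile_argv("january")
```

A CI job could pass the resulting list to `subprocess.run(argv)`. Building a list rather than a single string avoids shell quoting problems when a filename contains spaces.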
Enable progress logging for large runs
reconify reconcile \
--config reconify.yaml \
--pair bank_vs_stripe \
--format ndjson \
--progress \
--progress-every 1000000 \
--out results.ndjson
Progress logs go to stderr, not the output file, so they don't interfere with piped output.
Control the token-match buffer
When name_mode: "tokens" is set in your pair config, unmatched rows are buffered for token
matching after reference matching completes.
reconify reconcile \
--config reconify.yaml \
--pair bank_vs_stripe \
--max-token-buffer 100000
Set --max-token-buffer 0 for unlimited buffering, but only when you have enough memory for the full unmatched set.
Verify it worked
Open results.json (or pipe NDJSON through jq) and check the summary section first:
{
  "type": "summary",
  "matched_count": 842,
  "unmatched_left_count": 3,
  "unmatched_right_count": 1,
  "amount_diff_count": 0,
  "timing_diff_count": 2
}
A healthy run has high matched_count and low unmatched/diff counts. If unmatched_left_count or unmatched_right_count is unexpectedly high, read the unmatched rows and check for reference format mismatches, the most common cause. See Read Results for the full investigation workflow.
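
The summary check can also be automated. The sketch below applies limits to the counters shown in the sample summary; the function name and the limits are hypothetical, and what counts as "unexpectedly high" depends on your data.

```python
import json
from typing import List


def summary_problems(summary: dict, max_unmatched: int = 0, max_diffs: int = 0) -> List[str]:
    """Return human-readable problems; an empty list means within limits."""
    problems = []
    unmatched = summary["unmatched_left_count"] + summary["unmatched_right_count"]
    diffs = summary["amount_diff_count"] + summary["timing_diff_count"]
    if unmatched > max_unmatched:
        problems.append(f"{unmatched} unmatched rows (limit {max_unmatched})")
    if diffs > max_diffs:
        problems.append(f"{diffs} differing rows (limit {max_diffs})")
    return problems


# The sample summary from this page.
summary = json.loads("""
{
  "type": "summary",
  "matched_count": 842,
  "unmatched_left_count": 3,
  "unmatched_right_count": 1,
  "amount_diff_count": 0,
  "timing_diff_count": 2
}
""")
strict = summary_problems(summary)  # zero tolerance
lenient = summary_problems(summary, max_unmatched=5, max_diffs=5)
```

With zero tolerance the sample run reports both its 4 unmatched rows and its 2 timing differences. With a limit of 5 for each it passes.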