The HTML report
cf.to_html("report.html") writes a single self-contained HTML file (no server, no external assets) documenting one pipeline run. It is the primary per-run artifact.
import conformare as cf
cf.trackSpark()
cf.set_profiles({"*": [cf.rowCount], "filter": [cf.histogram(columns="all")]})
# ... your pipeline ...
cf.to_html("report.html", title="Customer scoring")
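Because the report is meant to be self-contained, it can be useful to confirm that a generated file references no separate assets before sharing it. The sketch below uses only the Python standard library. find_external_assets() is a hypothetical helper, not part of conformare, and the set of tags it inspects is an assumption about what counts as an asset.

```python
from html.parser import HTMLParser

# Tag -> attribute pairs treated as asset references (an assumption for this sketch).
ASSET_ATTRS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}


class _AssetScanner(HTMLParser):
    """Collects asset references that are not embedded in the file itself."""

    def __init__(self):
        super().__init__()
        self.external = []

    def handle_starttag(self, tag, attrs):
        wanted = ASSET_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            # Inline data: URIs live inside the file; anything else is a separate resource.
            if name == wanted and value and not value.lower().startswith("data:"):
                self.external.append((tag, value))


def find_external_assets(html_text):
    """Return (tag, url) pairs for assets that are loaded from outside the file."""
    scanner = _AssetScanner()
    scanner.feed(html_text)
    scanner.close()
    return scanner.external
```

Reading report.html as UTF-8 text and passing it to find_external_assets() should give an empty list if the file is fully self-contained under these assumptions. Ordinary hyperlinks are ignored on purpose, since they are not assets.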
What’s in it
- Overview — KPI cards (nodes, operations, columns, sources, sinks, sensitive columns, risks, expectations) and the describe_process() description.
- Process diagram — the lineage graph, with a settings menu:
  - Layout: Sequential (the default — disconnected chains stacked by execution order), Grouped, Layered, Swimlane.
  - Compress chained operations — roll a run of steps on one dataframe into a single expandable node.
  - Compress context-linked nodes — contract a describe() group into one node.
  - Show function calls — annotate nodes with the functions they pass through.
  - Expand operation details — show each operation’s captured expression inline (and, for collapsed chains / contracted contexts, every folded step plus the functions used).
  - Edge style, zoom, light/dark.
- Column index — every column and the nodes it appears in (long lists collapse to the first few with “… and N more”).
- Data sensitivity — flagged columns, their tags, and whether they reach a written output.
- Risk register — one row per risk: severity, mitigation, owner, and where it occurs.
- Context register — the describe() contexts, their purpose, owner and risks.
- Created columns — a catalog of created columns and a diagram showing every existing column that contributes to each one.
- Node profiles — per-node detail (rows, size, columns, distributions, expectations), collapsible by context / function / chain, with a “Hide non-profiled” toggle (on by default) so the section focuses on what was actually measured.
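The "whether they reach a written output" check in the Data sensitivity section can be pictured as reachability on the lineage graph. The sketch below only illustrates that idea and is not conformare's implementation: the edge dictionary, the node names, and reaches_sink() are all hypothetical, and it follows nodes rather than individual columns.

```python
from collections import deque


def reaches_sink(edges, start, sinks):
    """Return True if any node in `sinks` is reachable from `start`.

    `edges` maps a node name to the list of nodes that consume its output.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in sinks:
            return True
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical lineage: a source is filtered, then either written or only counted.
edges = {
    "read_customers": ["filter_active"],
    "filter_active": ["write_scores", "count_rows"],
}
sinks = {"write_scores"}

print(reaches_sink(edges, "read_customers", sinks))  # True
print(reaches_sink(edges, "count_rows", sinks))      # False
```

A breadth-first search is used here because it visits each node at most once, so cycles in the hypothetical graph would not cause an endless loop.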
Other per-run exports
| Function | Output |
|---|---|
| cf.to_html(path) | the interactive report above |
| cf.to_json(path) | the full model (build_model) as JSON, for your own tooling |
| cf.to_mermaid() | a Mermaid lineage diagram (embed in Markdown / wikis) |
| cf.to_risk_checklist(path) | a formal, sign-off-ready Markdown risk checklist |
| cf.build_model(cf.store) | the in-memory model dict the report renders from |
These describe a single run. To track results across runs, record each run to a fleet store — see Fleet recording.
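Since cf.to_json(path) is intended for your own tooling, a useful first step is often just to see what the exported file contains. The sketch below assumes nothing about the model's schema beyond the top level being a JSON object. summarize_model() is a hypothetical helper, not part of conformare.

```python
import json


def summarize_model(path):
    """Map each top-level key of the exported JSON to a short description."""
    with open(path, encoding="utf-8") as fh:
        model = json.load(fh)
    if not isinstance(model, dict):
        raise ValueError("expected a JSON object at the top level")
    summary = {}
    for key, value in model.items():
        if isinstance(value, (list, dict)):
            # Containers are described by type and size rather than dumped in full.
            summary[key] = f"{type(value).__name__} of length {len(value)}"
        else:
            summary[key] = type(value).__name__
    return summary
```

Calling summarize_model() on a file written by cf.to_json() and printing the resulting dictionary gives a quick overview of the keys to build on, without hard-coding any of them.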