Governance in dbt models

Experimental

Extending table descriptions covers reading governance from a table’s system comment. Many teams instead document their tables in dbt, whose schema YAML already has a meta field for arbitrary custom metadata. Conformare can read governance (purpose, owner, business context, and risks/mitigations) straight out of those meta fields, so a team that builds tables in dbt does not need to adopt this package for its risks to reach the developers who consume the data.

The convention is a conformare: block inside meta, at the model level and (for column-scoped risks) the column level.

The YAML

version: 2

models:
  - name: sales_channel
    description: Canonical mapping of channel_id to a channel name.
    meta:
      owner: Data Engineering
      conformare:
        purpose: Single source of truth for sales channels
        contexts:
          - Channel codes were remapped in 2026; pre-2026 joins are wrong
        risks:
          - id: definition.channel_remap
            severity: high
            mitigation: see the DE wiki for the mapping
            owner: Data Engineering
    columns:
      - name: channel
        description: Human-readable channel name.
        meta:
          conformare:
            risks:
              - id: privacy.partner_confidential
                severity: medium
                note: the 'Partner' value must not be shared externally

  • meta.conformare.purpose and meta.conformare.contexts describe what the model is for and any business rules a consumer must know.
  • meta.owner (or meta.conformare.owner) is the accountable owner.
  • risks is a list; each entry has an id and, optionally, severity, mitigation, owner, and note. A risk under a column’s meta.conformare is automatically scoped to that column.

This is plain dbt – meta is a first-class dbt field, so it lives in version control next to the model, surfaces in dbt docs, and needs no extra tooling on the dbt side.
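To make the column-scoping rule concrete, here is a minimal sketch of how a parsed model entry could be flattened into one governance dict. It is not Conformare's implementation: the helper name collect_governance, the "column" key added to column-level risks, and the choice to prefer meta.conformare.owner over meta.owner are all assumptions made for illustration. The input is the dict that a YAML parser would produce for the sales_channel entry above.

```python
from typing import Any


def collect_governance(model: dict[str, Any]) -> dict[str, Any]:
    """Flatten one parsed dbt model entry into a single governance dict.

    Hypothetical helper for illustration; not part of Conformare's API.
    """
    meta = model.get("meta") or {}
    block = meta.get("conformare") or {}

    # Model-level risks are copied as-is.
    risks = [dict(risk) for risk in block.get("risks") or []]

    # Column-level risks are tagged with the column they were declared under.
    # The "column" key is an assumption, not a documented field.
    for column in model.get("columns") or []:
        col_block = (column.get("meta") or {}).get("conformare") or {}
        for risk in col_block.get("risks") or []:
            risks.append({**risk, "column": column["name"]})

    return {
        "purpose": block.get("purpose"),
        # Assumed precedence: meta.conformare.owner, then meta.owner.
        "owner": block.get("owner") or meta.get("owner"),
        "contexts": list(block.get("contexts") or []),
        "risks": risks,
    }


# The sales_channel entry from the YAML above, already parsed into a dict.
model = {
    "name": "sales_channel",
    "meta": {
        "owner": "Data Engineering",
        "conformare": {
            "purpose": "Single source of truth for sales channels",
            "contexts": [
                "Channel codes were remapped in 2026; pre-2026 joins are wrong"
            ],
            "risks": [
                {
                    "id": "definition.channel_remap",
                    "severity": "high",
                    "mitigation": "see the DE wiki for the mapping",
                    "owner": "Data Engineering",
                }
            ],
        },
    },
    "columns": [
        {
            "name": "channel",
            "meta": {
                "conformare": {
                    "risks": [
                        {
                            "id": "privacy.partner_confidential",
                            "severity": "medium",
                            "note": "the 'Partner' value must not be shared externally",
                        }
                    ]
                }
            },
        }
    ],
}

doc = collect_governance(model)
for risk in doc["risks"]:
    print(risk["id"], risk.get("column", "(model-level)"))
```

The sketch keeps model-level and column-level risks in one list so that a consumer sees them together; only the column-level entries carry the extra scoping key.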

Reading it

import conformare as cf  # assumed import alias; this page uses cf throughout

doc = cf.read_dbt_governance("models/marts/schema.yml", model="sales_channel")
# {'purpose': ..., 'owner': 'Data Engineering', 'contexts': [...], 'risks': [...]}

Without model=, it returns {model_name: doc} for every model in the file. Reading dbt YAML needs pyyaml (pip install conformare[dbt]).

Resolving by table name (the dbt manifest)

You usually know the table a pipeline reads, not which schema.yml documents it. dbt’s compiled artifact target/manifest.json (produced by dbt compile, dbt run, or dbt docs generate) is the machine-readable source of truth: it lists every model with its database relation (database.schema.alias) alongside its description and meta. Point read_dbt_governance at the manifest and look governance up by table name – no need to find the YAML file, and it covers the whole project:

doc = cf.read_dbt_governance("target/manifest.json", table="analytics.marts.sales_channel")

The table argument is matched against the model name, schema.table, db.schema.table, or the quoted relation, so whichever of these forms your pipeline uses will resolve. (The HTML docs site and the markdown it renders are for humans; the manifest is the one to read programmatically. Reading JSON needs no extra dependency.)
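As a rough illustration of the matching described above, the sketch below looks a table up in a manifest-shaped dict. It is not Conformare's matching logic. The helper names (candidate_names, normalize, find_model), the case-insensitive comparison, and the "first match wins" behaviour are assumptions. The node fields used (resource_type, name, database, schema, alias, relation_name, meta) are the ones dbt writes for each model in manifest.json; the project name "shop" in the sample is made up.

```python
from typing import Any, Optional


def candidate_names(node: dict[str, Any]) -> set[str]:
    """All the spellings under which one manifest model node may be referenced."""
    database, schema = node.get("database"), node.get("schema")
    identifier = node.get("alias") or node["name"]
    names = {node["name"], identifier}
    if schema:
        names.add(f"{schema}.{identifier}")
    if database and schema:
        names.add(f"{database}.{schema}.{identifier}")
    if node.get("relation_name"):
        names.add(node["relation_name"])  # quoted form, e.g. "db"."schema"."table"
    return names


def normalize(name: str) -> str:
    """Strip identifier quoting and lowercase (case-insensitivity is an assumption)."""
    return name.replace('"', "").replace("`", "").lower()


def find_model(manifest: dict[str, Any], table: str) -> Optional[dict[str, Any]]:
    """Return the first model node that matches `table`, or None."""
    wanted = normalize(table)
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        if wanted in {normalize(name) for name in candidate_names(node)}:
            return node
    return None


# A tiny stand-in for json.load(open("target/manifest.json")).
manifest = {
    "nodes": {
        "model.shop.sales_channel": {
            "resource_type": "model",
            "name": "sales_channel",
            "database": "analytics",
            "schema": "marts",
            "alias": "sales_channel",
            "relation_name": '"analytics"."marts"."sales_channel"',
            "meta": {"owner": "Data Engineering"},
        },
        "test.shop.not_null_sales_channel_channel_id": {
            "resource_type": "test",
            "name": "not_null_sales_channel_channel_id",
        },
    }
}

node = find_model(manifest, "analytics.marts.sales_channel")
print(node["meta"] if node else "no match")
```

A bare model name can be ambiguous when two schemas contain a table of the same name; this sketch simply returns the first hit, so a real lookup would likely want to prefer the most qualified match.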

Surfacing it while you track a pipeline

Ingest a model’s governance into the fleet as static source risks – a one-off (or scheduled) sync of your dbt project:

cf.record_source_risk_from_dbt("models/marts/schema.yml", model="sales_channel")
# resolve by table name from the manifest (location = the relation):
cf.record_source_risk_from_dbt("target/manifest.json", table="analytics.marts.sales_channel")
# or every model in the source:
cf.record_source_risk_from_dbt("target/manifest.json")

From then on the dbt-declared risks behave like any other upstream risk: with cf.configure_store(..., warn_on_source=True) a pipeline that reads sales_channel is warned on load, cf.check_upstream_risks(["sales_channel"]) lists them, and they appear in the fleet dashboard’s inherited-risk section – so the consumer sees the risks, mitigations and design notes the dbt model’s authors wrote.

Matching the table. Conformare keys risks on the table location a pipeline reads. By default the dbt model name is used as the location; if your pipelines reference the table by a fully-qualified relation, pass location="catalog.schema.sales_channel" (when recording a single model) so the locations line up.

Experimental. The meta.conformare shape and the reader API may change.

