Skip to main content

Lineage tags coverage across systems

Measure and improve tag coverage and propagation across systems using Lakehouse. Use lineage and tag metadata to understand how sensitive or governed data flows from upstream sources to all downstream tables, views, and dashboards.

This page provides SQL templates and guidance for analyzing tag propagation and coverage gaps across your data assets.

Before you begin

Before you run these queries, make sure:

You can use SQL queries directly on your lineage and asset lookup tables in the Lakehouse to analyze tag propagation performance and coverage across systems.

Analyze tag propagation across lineage

Trace how governance tags flow from tagged upstream assets to all downstream consumers. Use this to identify where tags are missing, partially propagated, or replaced, and to calculate a tag inheritance percentage at each hop.

Use this query to:

  • Identify where expected governance tags such as PII, Confidential, Finance, or Cost_center are dropped along the lineage path.
  • Quantify tag inheritance percentage at each hop so you can measure coverage quality.
  • Highlight downstream assets that introduce new tags or have no tags at all, which are prime candidates for remediation or review.
USE ATLAN_GOLD;

-- ============================================
-- TAG PROPAGATION ANALYSIS
-- Trace how governance tags flow through downstream lineage
-- ============================================

WITH tagged_sources AS (
-- Identify tagged source assets
SELECT
GUID,
ASSET_NAME,
ASSET_TYPE,
ASSET_QUALIFIED_NAME,
CONNECTOR_NAME,
TAGS,
CERTIFICATE_STATUS,
OWNER_USERS,
TO_TIMESTAMP_LTZ(UPDATED_AT / 1000) AS LAST_UPDATED
FROM ASSETS
WHERE
STATUS = 'ACTIVE'
AND CONNECTOR_NAME = 'snowflake'
AND TAGS IS NOT NULL
AND ARRAY_SIZE(TAGS) > 0
AND (
ARRAY_CONTAINS('PII'::VARIANT, TAGS)
OR ARRAY_CONTAINS('Confidential'::VARIANT, TAGS)
OR ARRAY_CONTAINS('Finance'::VARIANT, TAGS)
OR ARRAY_CONTAINS('Cost_center'::VARIANT, TAGS)
)
),
downstream_lineage AS (
-- Expand downstream lineage
SELECT
TS.GUID AS SOURCE_GUID,
TS.ASSET_NAME AS SOURCE_NAME,
TS.ASSET_TYPE AS SOURCE_TYPE,
TS.CONNECTOR_NAME AS SOURCE_CONNECTOR,
TS.TAGS AS SOURCE_TAGS,
L.RELATED_GUID AS DOWNSTREAM_GUID,
L.RELATED_NAME AS DOWNSTREAM_NAME,
L.RELATED_TYPE AS DOWNSTREAM_TYPE,
L.LEVEL AS LINEAGE_DEPTH,
RA.TAGS AS DOWNSTREAM_TAGS,
RA.CONNECTOR_NAME AS DOWNSTREAM_CONNECTOR
FROM tagged_sources TS
INNER JOIN LINEAGE L
ON TS.GUID = L.START_GUID
LEFT JOIN ASSETS RA
ON L.RELATED_GUID = RA.GUID
WHERE
L.DIRECTION = 'DOWNSTREAM'
)
SELECT
SOURCE_NAME,
SOURCE_TYPE,
SOURCE_CONNECTOR,
SOURCE_TAGS,
LINEAGE_DEPTH,
DOWNSTREAM_NAME,
DOWNSTREAM_TYPE,
DOWNSTREAM_CONNECTOR,
DOWNSTREAM_TAGS,
CASE
WHEN DOWNSTREAM_TAGS IS NULL
OR ARRAY_SIZE(DOWNSTREAM_TAGS) = 0
THEN 'NO TAGS - Complete tag drop'
WHEN ARRAYS_OVERLAP(SOURCE_TAGS, DOWNSTREAM_TAGS)
THEN 'TAGS PROPAGATED - Some or all tags present'
ELSE 'DIFFERENT TAGS - Tagged but not inherited from source'
END AS TAG_PROPAGATION_STATUS,
ARRAY_EXCEPT(
SOURCE_TAGS,
COALESCE(DOWNSTREAM_TAGS, ARRAY_CONSTRUCT())
) AS DROPPED_TAGS,
CASE
WHEN DOWNSTREAM_TAGS IS NOT NULL
AND ARRAY_SIZE(DOWNSTREAM_TAGS) > 0
THEN ROUND(
(ARRAY_SIZE(
ARRAY_INTERSECTION(SOURCE_TAGS, DOWNSTREAM_TAGS)
) * 100.0) / ARRAY_SIZE(SOURCE_TAGS),
2
)
ELSE 0
END AS TAG_INHERITANCE_PCT
FROM downstream_lineage
ORDER BY
SOURCE_NAME,
LINEAGE_DEPTH,
DOWNSTREAM_NAME;
Sample output
SOURCE_NAMESOURCE_TYPESOURCE_TAGSLINEAGE_DEPTHDOWNSTREAM_NAMEDOWNSTREAM_TYPEDOWNSTREAM_CONNECTORTAG_PROPAGATION_STATUSTAG_INHERITANCE_PCT
ATableAttribute, Cost_center, Confidential1fraud_detection_model_v1DatabricksAIModelVersiondatabricksTAGS PROPAGATED - Some or all tags present100.00
ABC_LOT_IDColumnAttribute, Customer Service, Cost_center1ABC_LOT_IDPowerBIColumnpowerbiTAGS PROPAGATED - Some or all tags present100.00
ABOUT_MEColumnCustomer Service, Attribute, Confidential1ABOUT_MEColumnsnowflakeTAGS PROPAGATED - Some or all tags present100.00
ACCOUNTTablePII, Finance, Confidential, Customer Service1ACCOUNT (SALESFORCE.ACCOUNT)TableauDatasourcetableauTAGS PROPAGATED - Some or all tags present77.78
ACCOUNTTablePII, Finance, Confidential, Customer Service1STG_SALESFORCE__ACCOUNT_TMPViewsnowflakeTAGS PROPAGATED - Some or all tags present77.78
ACCOUNTTablePII, Finance, Confidential, Customer Service2STG_SALESFORCE__ACCOUNTTablesnowflakeTAGS PROPAGATED - Some or all tags present88.89
ACCOUNTTablePII, Finance, Confidential, Customer Service2Sales Operation DashboardTableauDashboardtableauTAGS PROPAGATED - Some or all tags present77.78

Common filters and customization examples

Use these filter examples to customize the tag propagation query for your specific needs. Add these filters to the tagged_sources CTE WHERE clause or the final SELECT statement as indicated.

-- ============================================
-- COMMON FILTERS FOR SNOWFLAKE
-- Add these to customize your query
-- ============================================

-- Filter by specific connectors
-- Add to tagged_sources WHERE clause:
AND CONNECTOR_NAME IN ('snowflake', 'bigquery', 'databricks')

-- Filter by specific governance tags
-- Replace the default tag filter with your organization's tags:
AND (
ARRAY_CONTAINS('Sensitive'::VARIANT, TAGS)
OR ARRAY_CONTAINS('GDPR'::VARIANT, TAGS)
OR ARRAY_CONTAINS('HIPAA'::VARIANT, TAGS)
)

-- Filter by certification status
-- Add to tagged_sources WHERE clause:
AND CERTIFICATE_STATUS = 'VERIFIED'

-- Filter by specific asset types
-- Add to tagged_sources WHERE clause:
AND ASSET_TYPE IN ('Table', 'View', 'MaterializedView')

-- Filter by time window (last 90 days)
-- Add to tagged_sources WHERE clause:
AND TO_TIMESTAMP_LTZ(UPDATED_AT / 1000) >= DATEADD(day, -90, CURRENT_DATE())

-- Filter by lineage depth
-- Add to final SELECT WHERE clause:
WHERE LINEAGE_DEPTH <= 3 -- Only analyze up to 3 hops downstream