Lineage full export
Use Lakehouse to export a unified view of lineage that you can reuse across impact analysis, dashboard impact, root cause analysis, and other lineage use cases.
Before you begin
Before you run these queries, make sure:
- Your Lakehouse setup is complete and Lakehouse is enabled for your organization. If setup isn't complete, see Get started with Lakehouse.
- The
LINEAGEtable is set up in your warehouse. This table is not part of the native GOLD namespace and must be created separately. See Set up lineage tables.
Full lineage export
These queries use the ASSETS table from the GOLD namespace in the Lakehouse catalog and the LINEAGE table, which you create separately in your warehouse (see Set up lineage tables).
The query returns lineage rows for assets updated in the last 2 days, including upstream and downstream dependencies. Use it to monitor recent updates, identify downstream dependencies for impact analysis, trace upstream sources during root cause investigations, and validate lineage changes after deployments.
The output includes the depth of each connection, which helps distinguish direct and indirect dependencies.
- Snowflake
- Databricks
- BigQuery
-- ============================================
-- FULL LINEAGE EXPORT
-- gold.assets: Lakehouse catalog (GOLD namespace)
-- LINEAGE: Customer-managed table (see "Set up lineage tables")
-- ============================================
SELECT
a.guid, -- Starting asset's unique identifier
a.asset_name, -- Starting asset name (e.g., DIM_CUSTOMERS)
a.asset_qualified_name, -- Full path including connection/database/schema
a.asset_type, -- Asset type (Table, View, Dashboard, etc.)
TO_TIMESTAMP_LTZ(a.updated_at / 1000) AS LAST_UPDATED, -- Convert epoch milliseconds to timestamp
DATEDIFF(
day,
TO_TIMESTAMP_LTZ(a.updated_at / 1000),
CURRENT_TIMESTAMP()
) AS DAYS_SINCE_UPDATE, -- Number of days since last update
l.start_name, -- Lineage start point name
l.start_type, -- Type at lineage start
l.related_name, -- Related asset name in the lineage path
l.related_type, -- Related asset type (Table, Dashboard, etc.)
l.direction, -- UPSTREAM (sources) or DOWNSTREAM (consumers)
l.level -- Depth in lineage (1=direct, 2=2-hops, etc.)
FROM gold.assets a
INNER JOIN LINEAGE l
ON a.guid = l.start_guid -- Join on asset identifier
WHERE
a.has_lineage -- Only include assets with lineage relationships
AND a.connector_name IN ('snowflake', 'redshift', 'bigquery') -- Focus on major data warehouses
AND DATEDIFF(
day,
TO_TIMESTAMP_LTZ(a.updated_at / 1000), -- Convert epoch milliseconds to timestamp
CURRENT_TIMESTAMP()
) <= 2 -- Only assets updated within last 2 days
GROUP BY
a.guid, a.asset_name, a.asset_qualified_name, a.asset_type,
a.updated_at,
l.start_name, l.start_type, l.related_name, l.related_type,
l.direction, l.level
ORDER BY
a.updated_at DESC, -- Most recently updated first
a.asset_qualified_name, -- Then group by asset
l.direction, -- Show upstream first, then downstream
l.level -- Order by proximity (closest first)
LIMIT 200;
-- ============================================
-- FULL LINEAGE EXPORT
-- gold.assets: Lakehouse catalog (GOLD namespace)
-- LINEAGE: Customer-managed table (see "Set up lineage tables")
-- ============================================
SELECT
a.guid, -- Starting asset's unique identifier
a.asset_name, -- Starting asset name (e.g., DIM_CUSTOMERS)
a.asset_qualified_name, -- Full path including connection/database/schema
a.asset_type, -- Asset type (Table, View, Dashboard, etc.)
TO_TIMESTAMP_LTZ(a.updated_at / 1000) AS LAST_UPDATED, -- Convert epoch milliseconds to timestamp
DATEDIFF(
day,
TO_TIMESTAMP_LTZ(a.updated_at / 1000),
CURRENT_TIMESTAMP()
) AS DAYS_SINCE_UPDATE, -- Number of days since last update
l.start_name, -- Lineage start point name
l.start_type, -- Type at lineage start
l.related_name, -- Related asset name in the lineage path
l.related_type, -- Related asset type (Table, Dashboard, etc.)
l.direction, -- UPSTREAM (sources) or DOWNSTREAM (consumers)
l.level -- Depth in lineage (1=direct, 2=2-hops, etc.)
FROM gold.assets a
INNER JOIN LINEAGE l
ON a.guid = l.start_guid -- Join on asset identifier
WHERE
a.has_lineage -- Only include assets with lineage relationships
AND a.connector_name IN ('snowflake', 'redshift', 'bigquery') -- Focus on major data warehouses
AND DATEDIFF(
day,
TO_TIMESTAMP_LTZ(a.updated_at / 1000), -- Convert epoch milliseconds to timestamp
CURRENT_TIMESTAMP()
) <= 2 -- Only assets updated within last 2 days
GROUP BY
a.guid, a.asset_name, a.asset_qualified_name, a.asset_type,
a.updated_at,
l.start_name, l.start_type, l.related_name, l.related_type,
l.direction, l.level
ORDER BY
a.updated_at DESC, -- Most recently updated first
a.asset_qualified_name, -- Then group by asset
l.direction, -- Show upstream first, then downstream
l.level -- Order by proximity (closest first)
LIMIT 200;
-- ============================================
-- FULL LINEAGE EXPORT
-- gold.assets: Lakehouse catalog (GOLD namespace)
-- LINEAGE: Customer-managed table (see "Set up lineage tables")
-- ============================================
SELECT
a.guid, -- Starting asset's unique identifier
a.asset_name, -- Starting asset name (e.g., DIM_CUSTOMERS)
a.asset_qualified_name, -- Full path including connection/database/schema
a.asset_type, -- Asset type (Table, View, Dashboard, etc.)
TIMESTAMP_MILLIS(CAST(a.updated_at AS INT64)) AS LAST_UPDATED, -- epoch ms -> TIMESTAMP
TIMESTAMP_DIFF(
CURRENT_TIMESTAMP(),
TIMESTAMP_MILLIS(CAST(a.updated_at AS INT64)),
DAY
) AS DAYS_SINCE_UPDATE, -- days since last update
l.start_name, -- Lineage start point name
l.start_type, -- Type at lineage start
l.related_name, -- Related asset name in the lineage path
l.related_type, -- Related asset type (Table, Dashboard, etc.)
l.direction, -- UPSTREAM (sources) or DOWNSTREAM (consumers)
l.level -- Depth in lineage (1=direct, 2=2-hops, etc.)
FROM gold.assets a
JOIN LINEAGE l
ON a.guid = l.start_guid -- Join on asset identifier
WHERE
a.has_lineage -- Only include assets with lineage relationships
AND a.connector_name IN ('snowflake', 'redshift', 'bigquery')
AND TIMESTAMP_DIFF(
CURRENT_TIMESTAMP(),
TIMESTAMP_MILLIS(CAST(a.updated_at AS INT64)),
DAY
) <= 2 -- Only assets updated within last 2 days
ORDER BY
a.updated_at DESC, -- Most recently updated first (epoch ms)
a.asset_qualified_name,
l.direction,
l.level
LIMIT 200;
Sample output
| ASSET_NAME | ASSET_TYPE | RELATED_NAME | RELATED_TYPE | DIRECTION | LEVEL |
|---|---|---|---|---|---|
| ACCOUNT_ID | Column | ACCOUNT_ID | PowerBIColumn | DOWNSTREAM | 1 |
| ASSESSMENT_ID | Column | ASSESSMENT_ID | PowerBIColumn | DOWNSTREAM | 1 |
| CREDIT_LIMIT | Column | CREDIT_LIMIT | PowerBIColumn | DOWNSTREAM | 1 |
| CREDIT_SCORE | Column | CREDIT_SCORE | PowerBIColumn | DOWNSTREAM | 1 |
| CREDIT_SCORE | Column | Executive Overview | PowerBIPage | DOWNSTREAM | 2 |
| CURRENT_BALANCE | Column | CURRENT_BALANCE | PowerBIColumn | DOWNSTREAM | 1 |
| CUSTOMER_ID | Column | CUSTOMER_ID | PowerBIColumn | DOWNSTREAM | 1 |
| CUSTOMER_ID | Column | Executive Overview | PowerBIPage | DOWNSTREAM | 2 |
| DEBT_TO_INCOME_RATIO | Column | DEBT_TO_INCOME_RATIO | PowerBIColumn | DOWNSTREAM | 1 |
| EXPOSURE_AT_DEFAULT | Column | EXPOSURE_AT_DEFAULT | PowerBIColumn | DOWNSTREAM | 1 |
| PROBABILITY_OF_DEFAULT | Column | PROBABILITY_OF_DEFAULT | PowerBIColumn | DOWNSTREAM | 1 |
| RECENT_INQUIRIES | Column | RECENT_INQUIRIES | PowerBIColumn | DOWNSTREAM | 1 |
| RISK_GRADE | Column | RISK_GRADE | PowerBIColumn | DOWNSTREAM | 1 |
| LAST_UPDATE | Column | Last Update | SigmaDataElementField | DOWNSTREAM | 1 |
| CART_TAX | Table | osha_inspection | Table | DOWNSTREAM | 1 |