Skip to main content

Set up lineage tables

This guide walks you through creating a helper LINEAGE view in your warehouse that works with the GOLD namespace's native LINEAGE_ADJACENCY_LIST table. With this view, you can run advanced lineage queries that include direction (upstream/downstream), hop level, and asset names in a single result set.

Prerequisites

Before you begin, make sure that:

  • Lakehouse is enabled for your tenant and the status shows Ready. See Enable Lakehouse.
  • Your query engine is connected to Lakehouse. See Connect Snowflake or Connect Athena.
  • You have write access to a database in your warehouse where you can create views and tables. (The Lakehouse catalog is read-only.)

Create LINEAGE view

The LINEAGE view is built on top of a BASE_EDGES table that enriches lineage edges with asset names and types. Follow these steps to create both objects in your warehouse.

  1. In your Snowflake warehouse, open a SQL editor and replace the placeholder values in the SQL below:

    • Replace MY_DATABASE.MY_SCHEMA with your target database and schema
    • Replace ATLAN_CONTEXT_STORE with your Lakehouse catalog database name

    Then, create the BASE_EDGES table by running the following SQL. This table joins each edge in LINEAGE_ADJACENCY_LIST to the ASSETS table to attach human-readable names and types to both ends:

CREATE OR REPLACE TABLE MY_DATABASE.MY_SCHEMA.BASE_EDGES (
input_guid,
input_name,
input_type,
output_guid,
output_name,
output_type
)
COMMENT = 'Lineage edges enriched with asset names and types'
AS
SELECT DISTINCT
e.from_guid AS input_guid,
COALESCE(i.asset_name, e.from_guid) AS input_name,
COALESCE(i.asset_type, 'unknown') AS input_type,
e.to_guid AS output_guid,
COALESCE(o.asset_name, e.to_guid) AS output_name,
COALESCE(o.asset_type, 'unknown') AS output_type
FROM ATLAN_CONTEXT_STORE."gold".lineage_adjacency_list e
LEFT JOIN (
SELECT DISTINCT guid, asset_name, asset_type
FROM ATLAN_CONTEXT_STORE."gold".assets
) i ON e.from_guid = i.guid
LEFT JOIN (
SELECT DISTINCT guid, asset_name, asset_type
FROM ATLAN_CONTEXT_STORE."gold".assets
) o ON e.to_guid = o.guid
WHERE e.from_guid IS NOT NULL
AND e.to_guid IS NOT NULL;
  1. Create the LINEAGE view by running the following SQL. This view computes multi-hop upstream and downstream lineage using a recursive CTE:
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.LINEAGE (
direction,
start_guid,
start_name,
start_type,
related_guid,
related_name,
related_type,
connecting_guid,
level
)
COMMENT = 'Multi-hop upstream and downstream lineage for all assets'
AS
WITH
DOWNSTREAM AS (
SELECT
input_guid AS start_guid,
input_name AS start_name,
input_type AS start_type,
output_guid AS related_guid,
output_name AS related_name,
output_type AS related_type,
input_guid AS connecting_guid,
1 AS level,
input_guid || ',' || output_guid AS path_str
FROM MY_DATABASE.MY_SCHEMA.BASE_EDGES
UNION ALL
SELECT
d.start_guid,
d.start_name,
d.start_type,
e.output_guid AS related_guid,
e.output_name AS related_name,
e.output_type AS related_type,
d.related_guid AS connecting_guid,
d.level + 1 AS level,
d.path_str || ',' || e.output_guid AS path_str
FROM DOWNSTREAM d
JOIN MY_DATABASE.MY_SCHEMA.BASE_EDGES e
ON d.related_guid = e.input_guid
WHERE POSITION(e.output_guid IN d.path_str) = 0
),
UPSTREAM AS (
SELECT
output_guid AS start_guid,
output_name AS start_name,
output_type AS start_type,
input_guid AS related_guid,
input_name AS related_name,
input_type AS related_type,
output_guid AS connecting_guid,
1 AS level,
output_guid || ',' || input_guid AS path_str
FROM MY_DATABASE.MY_SCHEMA.BASE_EDGES
UNION ALL
SELECT
u.start_guid,
u.start_name,
u.start_type,
e.input_guid AS related_guid,
e.input_name AS related_name,
e.input_type AS related_type,
u.related_guid AS connecting_guid,
u.level + 1 AS level,
u.path_str || ',' || e.input_guid AS path_str
FROM UPSTREAM u
JOIN MY_DATABASE.MY_SCHEMA.BASE_EDGES e
ON u.related_guid = e.output_guid
WHERE POSITION(e.input_guid IN u.path_str) = 0
)
SELECT DISTINCT
'DOWNSTREAM' AS direction,
start_guid, start_name, start_type,
related_guid, related_name, related_type,
connecting_guid, level
FROM DOWNSTREAM
WHERE related_guid IS NOT NULL
UNION ALL
SELECT DISTINCT
'UPSTREAM' AS direction,
start_guid, start_name, start_type,
related_guid, related_name, related_type,
connecting_guid, level
FROM UPSTREAM
WHERE related_guid IS NOT NULL;

Verify setup

After creating both objects, run a quick validation:

-- Check that edges were populated
SELECT COUNT(*) FROM MY_DATABASE.MY_SCHEMA.BASE_EDGES;

-- Check that lineage traversal works
SELECT * FROM MY_DATABASE.MY_SCHEMA.LINEAGE LIMIT 10;

Refresh data

The BASE_EDGES table is a materialized snapshot. To pick up new lineage connections, re-run the Step 1 CREATE OR REPLACE TABLE statement on a schedule that suits your needs (for example, hourly or daily). The LINEAGE view always reads live data from BASE_EDGES, so it doesn't need a separate refresh.

Next steps

Once you've created the LINEAGE view, you can run advanced lineage queries: