Skip to main content

Querying gold layer

This page provides best practices and recommended approaches for querying and using the Gold layer efficiently.

Query foundations

Every Gold layer query needs a solid foundation. These practices help you build queries that stay consistent as metadata changes and perform well at scale.

Start with the ASSETS table

What to do: Begin every query with the ASSETS table, then join to specialized views as needed.

Why it matters: ASSETS table is your central catalog. It contains core metadata for every asset (tables, dashboards, columns, etc.) in one place. Starting here gives you a complete view before you add governance, lineage, or quality details.

How to implement:

  1. Start with SELECT * FROM ASSETS
  2. Add filters to narrow to your scope (connector, asset type, status)
  3. Join to detail views like TAGS, LINEAGE, or CUSTOM_METADATA only after filtering

Example:

-- Start with ASSETS, filter early, then join
SELECT
a.NAME,
a.ASSET_TYPE,
t.TAG_NAME
FROM ASSETS a
WHERE a.STATUS = 'ACTIVE'
AND a.CONNECTOR_NAME = 'snowflake'
LEFT JOIN TAGS t ON a.GUID = t.ASSET_GUID

See also: ASSETS reference for full column list and usage patterns.

Use GUID for joins

What to do: Always join tables using GUID fields, not asset names or qualified names.

Why it matters: GUIDs are permanent identifiers. Asset names can change (tables get renamed, dashboards get updated), but GUIDs stay the same. Joining on names can cause joins to break or return incomplete results after renames.

How to implement:

  • Join ASSETS to detail views using ASSETS.GUID = [detail_view].ASSET_GUID
  • Use asset names in your SELECT clause for readability, not in JOIN or WHERE clauses
  • If you must filter by name, do it after the join

Example:

--  Good: Join on GUID
SELECT
a.NAME,
l.DOWNSTREAM_ASSET_GUID
FROM ASSETS a
JOIN LINEAGE l ON a.GUID = l.UPSTREAM_ASSET_GUID

-- Avoid: Join on name
SELECT
a.NAME,
l.DOWNSTREAM_ASSET_GUID
FROM ASSETS a
JOIN LINEAGE l ON a.NAME = l.UPSTREAM_ASSET_NAME -- Brittle!

Filter and scope queries

Filtering early keeps queries fast and costs low. Apply the right filters at the right time to scan only the data you need.

Filter early in the query

What to do: Apply filters on ASSETS before joining to other tables.

Why it matters: Filtering a 10-row result set is cheaper than filtering 10 million rows after joins. Early filters reduce the amount of data scanned and the size of intermediate results, which improves performance and lowers warehouse costs.

How to implement:

  1. Apply your most selective filters first (status, connector, asset type)
  2. Filter on ASSETS before any joins
  3. Add detail view joins only after the base set is narrowed

Example:

--  Good: Filter on ASSETS first
SELECT
a.NAME,
cm.CUSTOM_METADATA_KEY,
cm.CUSTOM_METADATA_VALUE
FROM ASSETS a
WHERE a.STATUS = 'ACTIVE'
AND a.CONNECTOR_NAME = 'bigquery'
AND a.ASSET_TYPE = 'Table'
LEFT JOIN CUSTOM_METADATA cm ON a.GUID = cm.ASSET_GUID

-- Less efficient: Filter after joining
SELECT
a.NAME,
cm.CUSTOM_METADATA_KEY,
cm.CUSTOM_METADATA_VALUE
FROM ASSETS a
LEFT JOIN CUSTOM_METADATA cm ON a.GUID = cm.ASSET_GUID
WHERE a.STATUS = 'ACTIVE' -- Large join happens first

Use common filters for targeted analysis

What to do: Use these filters to narrow your result set before adding complexity.

Why it matters: Focused queries are faster and easier to maintain. These filters cover common scenarios and help you build reusable query templates.

What you wantColumn to filterExample
Specific connectorCONNECTOR_NAMEWHERE CONNECTOR_NAME = 'snowflake'
Specific asset typeASSET_TYPEWHERE ASSET_TYPE = 'Table'
Active assets onlySTATUSWHERE STATUS = 'ACTIVE'
Certified assetsCERTIFICATE_STATUSWHERE CERTIFICATE_STATUS = 'VERIFIED'
Assets with ownersOWNER_USERSWHERE ARRAY_SIZE(OWNER_USERS) > 0
Popular assetsPOPULARITY_SCOREWHERE POPULARITY_SCORE > 50
Assets with lineageHAS_LINEAGEWHERE HAS_LINEAGE = true
Assets with termsTERM_GUIDSWHERE ARRAY_SIZE(TERM_GUIDS) > 0
Assets with tagsTAGSWHERE ARRAY_SIZE(TAGS) > 0
Specific databaseASSET_QUALIFIED_NAMEWHERE ASSET_QUALIFIED_NAME LIKE '%/prod_db/%'

Example query:

-- Find all active Snowflake tables with owners
SELECT
NAME,
OWNER_USERS,
POPULARITY_SCORE
FROM ASSETS
WHERE STATUS = 'ACTIVE'
AND CONNECTOR_NAME = 'snowflake'
AND ASSET_TYPE = 'Table'
AND ARRAY_SIZE(OWNER_USERS) > 0
ORDER BY POPULARITY_SCORE DESC

Understand the model

Gold layer is a set of related views, not isolated tables. Use the model to plan joins and understand how assets relate across metadata, lineage, and governance.

Use the entity-relationship diagram

What to do: Reference the ER diagram before building complex queries with multiple joins.

Why it matters: Diagram shows how views connect to each other. Understanding relationships helps you choose the right join path and avoid incomplete or duplicate results.

How to implement:

  1. Open the ER diagram below
  2. Identify your starting point (usually ASSETS)
  3. Trace the relationship path to your target view
  4. Confirm whether joins are 1:1, 1:many, or many:many

Diagram:

Lakehouse Gold layer ER diagram

Example use case: If you want lineage data, the diagram shows that LINEAGE joins directly to ASSETS via ASSET_GUID. If you want custom metadata, CUSTOM_METADATA also joins directly to ASSETS.

Query quality and safety

These practices help you avoid silent errors and unexpected results.

Handle nulls explicitly

What to do: Use COALESCE or explicit NULL checks when filtering or aggregating optional fields.

Why it matters: Many fields in the Gold layer are optional (owners, tags, descriptions). SQL WHERE clauses exclude rows with NULL values by default, which can silently drop assets from your results.

How to implement:

  • Use COALESCE(column, default_value) when aggregating
  • Use column IS NULL OR column = 'value' when filtering
  • Check for empty arrays with ARRAY_SIZE(column) = 0 instead of assuming NULL

Example:

--  Good: Handle NULLs explicitly
SELECT
NAME,
COALESCE(DESCRIPTION, 'No description') AS DESCRIPTION,
ARRAY_SIZE(COALESCE(OWNER_USERS, [])) AS OWNER_COUNT
FROM ASSETS
WHERE STATUS = 'ACTIVE'

-- Risky: NULL description silently excluded
SELECT NAME, DESCRIPTION
FROM ASSETS
WHERE DESCRIPTION LIKE '%customer%' -- Drops rows where DESCRIPTION IS NULL

Test with small result sets

What to do: Add LIMIT 100 to queries during development, then remove it for production runs.

Why it matters: Testing on small result sets reduces compute costs and speeds up iteration. You can validate query logic without scanning millions of rows.

How to implement:

  1. Add LIMIT 100 to your query
  2. Validate column names, joins, and logic
  3. Remove the limit when results look correct
  4. Run the full query or schedule it in a dashboard

Example:

-- During development
SELECT
a.NAME,
t.TAG_NAME
FROM ASSETS a
LEFT JOIN TAGS t ON a.GUID = t.ASSET_GUID
WHERE a.STATUS = 'ACTIVE'
LIMIT 100; -- Remove this for production

Joins for common use cases

In the Gold layer, the ASSETS table is the core table that lists all assets and important attributes. However, the Gold layer also contains complementary views that contain details for specific asset types. For example, the RELATIONAL_ASSET_DETAILS view contains domain-specific details for relational assets in Atlan, such as the list of columns for a table, so you'll need to join the ASSETS table with the RELATIONAL_ASSET_DETAILS as part of your query.

This section outlines common relationships and joins that may be helpful.

GUID as the universal key

Every asset has a unique GUID. This is your primary key for joining tables.

Tag use cases

For tag-based use cases (for example, find all assets with specific tags, analyze tag coverage, compliance reporting), join ASSETS with the TAGS view using ASSETS.GUID = TAGS.ASSET_GUID as the join condition.

Custom metadata use cases

For custom metadata use cases (for example, find assets with specific custom attributes you have defined in Atlan), join ASSETS with the CUSTOM_METADATA view using ASSETS.GUID = CUSTOM_METADATA.ASSET_GUID as the join condition.

README use cases

To get the full readme text for an asset (helpful in queries that search documentation or find undocumented assets, or general knowledge base queries), join ASSETS with the README view using ASSETS.README_GUID = README.GUID as the join condition.

Lineage use cases

The Gold layer has a LINEAGE table that enables simple impact analysis, root cause analysis, data flow visualization, and change management. Every row in the LINEAGE table has a START_GUID column that describes the starting point for lineage, and a RELATED_GUID column that describes the adjacent related asset. Use ASSETS.GUID = LINEAGE.START_GUID to get details about the starting asset, and LINEAGE.RELATED_GUID = ASSETS.GUID to get related asset details.

Glossary use cases

The GLOSSARY_DETAILS view contains details about glossaries, categories, and terms in Atlan. To find which terms are on an asset, join ASSETS with GLOSSARY_DETAILS using ASSETS.TERM_GUIDS contains GLOSSARY_DETAILS.GUID.

To understand term adoption analysis, finding assets by business term, and build glossary coverage reports, join GLOSSARY_DETAILS with ASSETS using GLOSSARY_DETAILS.ASSIGNED_ENTITIES contains ASSETS.GUID.

GLOSSARY_DETAILS also has an ANCHOR_GUID column that describes the glossary that the term or category belongs to, so you can understand glossary structure or hierarchy (for example, link a term or category to its parent glossary) or find all terms in a glossary by doing a self-join on GLOSSARY_DETAILS using GLOSSARY_DETAILS.ANCHOR_GUID = GLOSSARY_DETAILS.GUID.

Exploring relational asset hierarchy

The RELATIONAL_ASSET_DETAILS view contains details about all relational assets registered in Atlan, including databases, schemas, tables, and columns. Details also include structural information about each asset that can be used to explore schemas and database inventory, such as DATABASE_SCHEMAS (GUIDs for schemas in a database) and TABLE_COLUMNS (GUIDs for columns in a table).

Understanding pipelines and data sources/outputs

You can use the PIPELINE_DETAILS view to understand pipeline dependency analysis, map ETL/ELT flows, and track transformations. Each row in PIPELINE_DETAILS has INPUT_GUIDS_TO_PROCESSES and OUTPUT_GUIDS_TO_PROCESSES columns that are arrays of input/output assets to each process, that you can then flatten and join with ASSETS for further analysis.

Data quality use cases

To monitor data quality checks, track failed checks, and build asset quality reports, join the DATA_QUALITY_DETAILS view with ASSETS using DATA_QUALITY_DETAILS.GUID = ASSETS.GUID as the join key.