Metadata export
Organizations need to export metadata from Atlan to support AI applications, internal data marketplaces, and governance workflows across source systems. Use Lakehouse as a single place to query enriched metadata and feed it into downstream tools and applications.
Before you begin
Before you run these queries, make sure:
- Your Lakehouse setup is complete and Lakehouse is enabled for your organization. If setup isn't complete, see Get started with Lakehouse.
The Gold namespace is built natively into the catalog -- no additional setup is needed. You can run these queries directly on the gold.ASSETS table and the raw tables in entity_metadata (such as entity_metadata.Readme, entity_metadata.CustomMetadata, and entity_metadata.TagRelationship) to export metadata for AI assistants, data marketplaces, and reverse sync workflows.
Retrieve asset metadata
Use this query to retrieve assets from Atlan with enrichment metadata that you can export to external tools. It's particularly useful for:
- Finding assets with good documentation. Update the
WHEREclause to filter assets that match your use case. Several examples are shown in the section below. - Powering data discovery, quality, or governance dashboards. Update the
<database>placeholder to match your environment. - Identifying certified and actively used assets.
USE <database>;
SELECT
A.asset_name, -- Human-readable name of the asset
A.guid, -- Unique identifier for the asset
A.asset_qualified_name, -- Fully qualified name (e.g., db.schema.table)
A.asset_type, -- Type of asset: Table, Column, View, etc.
A.description, -- System description
A.user_description, -- User-authored description
A.status, -- Lifecycle status: ACTIVE, DELETED
A.certificate_status, -- Certification state: VERIFIED, DRAFT, DEPRECATED
A.owner_users, -- Array of asset owners
tr.TAGS, -- Array of tags applied to the asset
R.description AS README_TEXT,-- Rich documentation from README (if present)
A.popularity_score, -- Usage-based popularity metric
A.has_lineage -- Boolean indicating if lineage exists
FROM
gold.assets A
LEFT JOIN
entity_metadata.Readme R
ON A.readme_guid = R.GUID -- Join README content to the asset
LEFT JOIN (
SELECT guid, ARRAY_AGG(tagName) AS TAGS
FROM entity_metadata.TagRelationship
WHERE status = 'ACTIVE'
GROUP BY guid
) tr ON A.guid = tr.guid
WHERE
A.asset_type IN ('Table', 'Column', 'View') -- Focus on core data assets
AND A.connector_name IN ('snowflake', 'redshift', 'bigquery') -- Cloud warehouses
AND R.description IS NOT NULL -- Only return assets with README documentation
LIMIT 10; -- Sample output for inspection
Sample output
| ASSET_NAME | ASSET_TYPE | DESCRIPTION | USER_DESCRIPTION | STATUS | CERTIFICATE_STATUS | OWNER_USERS | TAGS | HAS_LINEAGE |
|---|---|---|---|---|---|---|---|---|
| instacart_orders | Table | Transaction log with order timing and reorder patterns | Key dataset for reorder analysis | ACTIVE | VERIFIED | andrew | Attribute, Finance, Cost_center, Confidential | TRUE |
| dwh_dmatrizcampana | Table | Central source of truth for date, holiday, and fiscal periods | ACTIVE | VERIFIED | Attribute, Customer Service, Confidential | FALSE |
Retrieve asset enrichment with custom metadata
- Snowflake
- Databricks
- BigQuery
USE <database>;
-- ============================================
-- ASSET ENRICHMENT WITH CUSTOM METADATA SCORES
-- ============================================
WITH CM_DETAILS AS (
SELECT
CM.GUID,
-- Create JSON object using LISTAGG and PARSE_JSON
PARSE_JSON(
'{' ||
LISTAGG(
'"' || CM.attributeName || '":"' ||
REPLACE(COALESCE(CM.attributeValue, ''), '"', '\\"') || '"',
','
) WITHIN GROUP (ORDER BY CM.attributeName)
|| '}'
) AS CM_ATTRIBUTES_JSON
FROM entity_metadata.CustomMetadata CM
WHERE
CM.STATUS = 'ACTIVE'
AND CM.customMetadataSetName = 'AI Readiness' -- 🔧 CUSTOMIZE
AND CM.attributeValue IS NOT NULL
GROUP BY CM.GUID
)
SELECT
A.asset_name,
A.guid,
A.asset_qualified_name,
A.asset_type,
A.description,
A.user_description,
R.description AS README_TEXT,
A.status,
A.certificate_status,
A.owner_users,
tr.TAGS,
A.popularity_score,
A.has_lineage,
-- All Custom Metadata attributes as JSON
CMS.CM_ATTRIBUTES_JSON
FROM gold.assets A
LEFT JOIN entity_metadata.Readme R ON A.readme_guid = R.GUID
LEFT JOIN CM_DETAILS CMS ON A.guid = CMS.GUID
LEFT JOIN (
SELECT guid, ARRAY_AGG(tagName) AS TAGS
FROM entity_metadata.TagRelationship
WHERE status = 'ACTIVE'
GROUP BY guid
) tr ON A.guid = tr.guid
WHERE
A.asset_type IN ('Table', 'Column', 'View')
AND A.connector_name IN ('snowflake', 'redshift', 'bigquery')
AND A.status = 'ACTIVE'
AND CMS.CM_ATTRIBUTES_JSON IS NOT NULL
AND A.owner_users IS NOT NULL
LIMIT 5;
Replace the owner value, such as [email protected], to match your environment.
USE <database>;
-- ============================================
-- ASSET ENRICHMENT WITH CUSTOM METADATA SCORES
-- ============================================
WITH CM_JSON AS (
SELECT
CM.GUID,
-- Store as parallel arrays (simpler, more reliable)
collect_list(CM.attributeName) AS SCORE_ATTRIBUTE_NAMES,
collect_list(CM.attributeValue) AS SCORE_ATTRIBUTE_VALUES,
-- Also create JSON string for easy extraction
concat(
'{',
concat_ws(
',',
collect_list(
concat(
'"', CM.attributeName, '":"',
regexp_replace(coalesce(CM.attributeValue, ''), '"', '\\\\"'), '"'
)
)
),
'}'
) AS CM_ATTRIBUTES_JSON
FROM entity_metadata.CustomMetadata CM
WHERE
CM.STATUS = 'ACTIVE'
AND CM.customMetadataSetName = 'AI Readiness' -- 🔧 CUSTOMIZE: Your CM set name
AND CM.attributeValue IS NOT NULL
GROUP BY CM.GUID
)
SELECT
-- ========================================
-- ASSET IDENTIFICATION
-- ========================================
A.asset_name,
A.guid,
A.asset_qualified_name,
A.asset_type,
-- ========================================
-- DOCUMENTATION
-- ========================================
A.description,
A.user_description,
R.description AS README_TEXT,
-- ========================================
-- GOVERNANCE METADATA
-- ========================================
A.status,
A.certificate_status,
A.owner_users,
tr.TAGS,
-- ========================================
-- USAGE METRICS
-- ========================================
A.popularity_score,
A.has_lineage,
-- ========================================
-- CUSTOM METADATA (Multiple Formats)
-- ========================================
-- Option 1: JSON string for extraction
CMS.CM_ATTRIBUTES_JSON
FROM gold.assets A
LEFT JOIN entity_metadata.Readme R
ON A.readme_guid = R.GUID
LEFT JOIN CM_JSON CMS
ON A.guid = CMS.GUID
LEFT JOIN (
SELECT guid, collect_list(tagName) AS TAGS
FROM entity_metadata.TagRelationship
WHERE status = 'ACTIVE'
GROUP BY guid
) tr ON A.guid = tr.guid
WHERE
A.connector_name IN ('snowflake', 'redshift', 'bigquery', 'oracle')
AND A.status = 'ACTIVE'
AND CMS.CM_ATTRIBUTES_JSON IS NOT NULL
AND A.owner_users IS NOT NULL
AND size(A.owner_users) > 0
LIMIT 10;
SET @@dataset_id = '<database>';
-- ============================================
-- ASSET ENRICHMENT WITH CUSTOM METADATA SCORES
-- ============================================
WITH CM_JSON AS (
SELECT
CM.GUID,
-- Construct JSON object as string
PARSE_JSON(
CONCAT(
'{',
STRING_AGG(
CONCAT(
'"', CM.attributeName, '":"',
IFNULL(CM.attributeValue, ''), '"'
),
','
ORDER BY CM.attributeName
),
'}'
)
) AS CM_ATTRIBUTES_JSON
FROM entity_metadata.CustomMetadata CM
WHERE
CM.STATUS = 'ACTIVE'
AND CM.customMetadataSetName = 'AI Readiness' -- 🔧 CUSTOMIZE
GROUP BY CM.GUID
)
SELECT
A.asset_name,
A.guid,
A.asset_qualified_name,
A.asset_type,
A.description,
A.user_description,
R.description AS README_TEXT,
A.status,
A.certificate_status,
A.owner_users,
tr.TAGS,
A.popularity_score,
A.has_lineage,
CMS.CM_ATTRIBUTES_JSON,
-- Extract specific attributes
JSON_EXTRACT_SCALAR(CMS.CM_ATTRIBUTES_JSON, '$.data_quality_score') AS data_quality_score,
JSON_EXTRACT_SCALAR(CMS.CM_ATTRIBUTES_JSON, '$.ai_readiness_score') AS ai_readiness_score
FROM gold.assets A
LEFT JOIN entity_metadata.Readme R ON A.readme_guid = R.GUID
LEFT JOIN CM_JSON CMS ON A.guid = CMS.GUID
LEFT JOIN (
SELECT guid, ARRAY_AGG(tagName) AS TAGS
FROM entity_metadata.TagRelationship
WHERE status = 'ACTIVE'
GROUP BY guid
) tr ON A.guid = tr.guid
WHERE
A.connector_name IN ('snowflake', 'redshift', 'bigquery', 'oracle')
AND A.status = 'ACTIVE'
AND CMS.CM_ATTRIBUTES_JSON IS NOT NULL
AND A.owner_users IS NOT NULL
AND ARRAY_LENGTH(A.owner_users) > 0
LIMIT 10;
Sample output
| ASSET_NAME | ASSET_TYPE | DESCRIPTION | STATUS | CERTIFICATE_STATUS | OWNER_USERS | HAS_LINEAGE | CM_ATTRIBUTES_JSON |
|---|---|---|---|---|---|---|---|
| ACCOUNTS | View | ACTIVE | FALSE | pii_risk_level_for_ai: medium | |||
| POSITIONS | Table | Provides details of job roles or positions | ACTIVE | VERIFIED | ravi | FALSE | ai_certification_status: In Review, pii_risk_level_for_ai: medium |
| average_sale_price_per_model_per_year_apac | Table | ACTIVE | VERIFIED | christopher.perro, esthephany.ayala | TRUE | pii_risk_level_for_ai: medium | |
| INSTACART_ALL_ORDER_PRODUCTS_USER | Table | Comprehensive record of customer orders | ACTIVE | VERIFIED | bryan | TRUE | representativeness_checked: True |
| CLIENT_BRANCH_LOAN_CLEANSED | Table | Information on individual loans and branch revenue | ACTIVE | VERIFIED | sean.uptigrove | TRUE | pii_risk_level_for_ai: high, representativeness_checked: True |