
Build with lineage context

Use these workflows to handle catalog operations as part of the development process: analyzing downstream impact before merging schema changes, documenting SQL models with business context from the catalog, tracing data quality issues to their upstream source, and registering lineage for pipelines that aren't tracked automatically. These operations are typically required before a schema change ships or a new pipeline goes to production, and you can complete them from your AI client without switching tools.

Info: Make sure Atlan MCP is configured before running these workflows. For setup, see Set up Atlan MCP.

SQL impact analysis at code review

A rename or type change in one table can silently break views, dbt models, and dashboards two or three hops downstream. This workflow surfaces every consumer of a changed asset before the PR is merged, so you can notify owners and plan for the impact.

How it works
1. Identify tables or columns changed in the PR (semantic_search)
2. Traverse downstream lineage to find all consumers (traverse_lineage)
3. Return impacted assets with owners and certification status (get_asset)
Example prompt
I'm about to rename the user_id column to customer_id in the dim_users table. Show me every downstream view, table, and dashboard that references dim_users so I can assess the impact before merging.
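
The downstream traversal in step 2 amounts to a breadth-first walk over lineage edges. The following is a minimal sketch of that idea, assuming a hypothetical in-memory graph; it does not call Atlan MCP, and every asset name other than dim_users and monthly_revenue_summary is invented for illustration.

```python
from collections import deque

# Hypothetical lineage graph: asset -> direct downstream consumers.
# In a real run these edges would come from the lineage traversal step.
DOWNSTREAM = {
    "dim_users": ["v_active_users", "fct_orders"],
    "v_active_users": ["Retention Dashboard"],
    "fct_orders": ["monthly_revenue_summary"],
    "monthly_revenue_summary": ["Revenue Dashboard"],
}


def downstream_consumers(graph, start):
    """Return {asset: hops from start} for every asset reachable downstream."""
    hops = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        asset, depth = queue.popleft()
        for consumer in graph.get(asset, []):
            if consumer not in seen:  # already reached by a path at least as short
                seen.add(consumer)
                hops[consumer] = depth + 1
                queue.append((consumer, depth + 1))
    return hops


impacted = downstream_consumers(DOWNSTREAM, "dim_users")
for asset, depth in sorted(impacted.items(), key=lambda kv: (kv[1], kv[0])):
    print(f"{depth} hop(s): {asset}")
```

Recording the hop count alongside each consumer makes it easier to decide whom to notify first: direct consumers usually break immediately, while assets several hops away may only break after the next refresh.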

Document SQL with business context

SQL model documentation written in isolation is often too technical for business users. This workflow pulls the glossary terms and descriptions already in Atlan for the source assets and uses them to generate a business-readable description for your model.

How it works
1. Retrieve glossary terms and descriptions for source assets (semantic_search)
2. Generate a business-readable description for the model (get_asset)
3. Write the description back to the asset in Atlan (resolve_metadata)
Example prompt
For the monthly_revenue_summary model in dbt, look up the Atlan descriptions and glossary terms for its source tables and write a business-friendly description explaining what it calculates and who it's for.
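
Before a description can be generated, the source metadata has to be gathered into one place. The sketch below shows one way that context might be assembled, assuming the retrieval step returns a name, description, and glossary terms per source table. The record shape, the `build_context` helper, and the source table names are all hypothetical.

```python
# Hypothetical metadata for the model's source tables, as step 1 might return it.
SOURCES = [
    {"name": "fct_orders", "description": "One row per completed order.",
     "terms": ["Net Revenue", "Order"]},
    {"name": "dim_customers", "description": "",
     "terms": ["Customer", "Order"]},
]


def build_context(model_name, sources):
    """Return (context text, names of sources lacking a catalog description)."""
    terms = sorted({t for s in sources for t in s["terms"]})  # de-duplicated
    undocumented = [s["name"] for s in sources if not s["description"].strip()]
    lines = [f"Model: {model_name}", "Sources:"]
    for s in sources:
        text = s["description"].strip() or "(no description in catalog)"
        lines.append(f"- {s['name']}: {text}")
    lines.append("Glossary terms: " + ", ".join(terms))
    return "\n".join(lines), undocumented


context, undocumented = build_context("monthly_revenue_summary", SOURCES)
print(context)
print("Missing descriptions:", undocumented)
```

Surfacing the undocumented sources separately is a deliberate choice: a generated description is only as good as the catalog metadata behind it, so gaps are worth fixing before writing anything back.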

Dashboard root cause analysis

When a dashboard shows wrong numbers, the root cause is rarely in the dashboard itself. It's usually a schema change, a bad transformation, or a data issue two or three hops upstream. This workflow traces lineage from the dashboard back to the source to find the broken link.

How it works
1. Find the dashboard showing the anomaly (semantic_search)
2. Traverse upstream lineage through source tables and transforms (traverse_lineage)
3. Review metadata and announcements to pinpoint the issue (get_asset)
Example prompt
The Weekly Sales Dashboard is showing revenue figures 30% lower than expected since Monday. Trace its upstream lineage and show me every source table and transformation in the chain along with their owners and any recent announcements.
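
Steps 2 and 3 can be pictured as an upstream walk followed by a filter on recent announcements. The sketch below assumes a hypothetical in-memory catalog; the `Asset` record, the owners, the announcement text, and the dates are invented for illustration and do not reflect Atlan's data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Asset:
    name: str
    owner: str
    upstream: list[str]
    announcement: Optional[str] = None
    announced_on: Optional[date] = None


# Hypothetical catalog contents.
CATALOG = {a.name: a for a in [
    Asset("Weekly Sales Dashboard", "bi-team", ["fct_sales"]),
    Asset("fct_sales", "analytics-eng", ["stg_orders", "stg_refunds"]),
    Asset("stg_orders", "data-eng", ["raw.orders"],
          announcement="Column amount switched from cents to dollars",
          announced_on=date(2024, 6, 3)),
    Asset("stg_refunds", "data-eng", ["raw.refunds"]),
    Asset("raw.orders", "data-eng", []),
    Asset("raw.refunds", "data-eng", []),
]}


def upstream_chain(catalog, start):
    """Yield every upstream asset of `start` exactly once, nearest hops first."""
    seen, frontier = {start}, [start]
    while frontier:
        next_frontier = []
        for name in frontier:
            for parent in catalog[name].upstream:
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
                    yield catalog[parent]
        frontier = next_frontier


def suspects(catalog, start, since):
    """Upstream assets carrying an announcement dated on or after `since`."""
    return [a for a in upstream_chain(catalog, start)
            if a.announced_on is not None and a.announced_on >= since]


for a in suspects(CATALOG, "Weekly Sales Dashboard", since=date(2024, 6, 3)):
    print(f"{a.name} (owner: {a.owner}): {a.announcement}")
```

Filtering by the date the anomaly first appeared narrows the chain to changes that could plausibly explain it, which is usually a shorter list than the full upstream graph.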

Build lineage for custom pipelines

Lineage for custom scripts, Spark jobs, and internal ETL tools isn't tracked automatically. Without it, impact analysis and root cause tracing are incomplete. This workflow registers the source-to-target relationship in Atlan so the full data flow is visible.

How it works
1. Identify source and target assets for the pipeline (semantic_search)
2. Resolve their qualified names in Atlan (resolve_metadata)
3. Register the lineage connection between source and target (get_asset)
Example prompt
I have a custom Spark job that reads from raw.clickstream_events and writes to analytics.user_sessions. Register the lineage between these two tables in Atlan so it shows up in the lineage graph.
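
A source-to-target relationship is often modelled as a process record with inputs and outputs. The sketch below assumes that shape to show what resolving names and assembling the edge might involve. The `LineageEdge` type, the `resolve` lookup, the process name, and the qualified-name format are hypothetical and are not Atlan's API.

```python
from dataclasses import dataclass

# Hypothetical lookup standing in for step 2 (resolving qualified names).
# The qualified-name format here is made up for illustration.
QUALIFIED_NAMES = {
    "raw.clickstream_events": "warehouse/raw/clickstream_events",
    "analytics.user_sessions": "warehouse/analytics/user_sessions",
}


@dataclass(frozen=True)
class LineageEdge:
    process_name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def resolve(name: str) -> str:
    """Map a short table name to its qualified name, or fail loudly."""
    try:
        return QUALIFIED_NAMES[name]
    except KeyError:
        raise LookupError(f"asset not found in catalog: {name}") from None


def build_edge(process_name, sources, targets):
    """Resolve both ends and reject edges that loop back onto themselves."""
    inputs = tuple(resolve(s) for s in sources)
    outputs = tuple(resolve(t) for t in targets)
    if set(inputs) & set(outputs):
        raise ValueError("an asset cannot be both source and target of one edge")
    return LineageEdge(process_name, inputs, outputs)


edge = build_edge(
    "spark_sessionize",  # hypothetical name for the Spark job
    sources=["raw.clickstream_events"],
    targets=["analytics.user_sessions"],
)
print(edge)
```

Resolving both ends before building the edge means a typo in a table name fails early, rather than producing a lineage edge that points at nothing.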

Onboard files from object storage

Files in S3 or GCS buckets are often invisible to the catalog until someone crawls them manually, by which time they're already being consumed without governance. This workflow registers them proactively with descriptions, owners, and tags before they're used downstream.

How it works
1. Identify the file paths or buckets to register (semantic_search)
2. Create catalog assets with descriptions and schema info (resolve_metadata)
3. Set owners, tags, and relevant custom metadata (update_assets)
Example prompt
Register the parquet files in s3://data-lake/raw/transactions/ as catalog assets. Set the description to raw transaction events from the payment processor, add the Data Engineering team as owner, and tag them as raw data.
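
Registering a path starts with splitting it into a bucket and a prefix. The sketch below does that with Python's standard `urllib.parse` and packages the result with the requested metadata. The record layout and the `asset_record` helper are hypothetical; the actual fields Atlan expects may differ.

```python
from urllib.parse import urlparse


def asset_record(uri, description, owner, tags):
    """Build a hypothetical catalog record for an object-storage location."""
    parsed = urlparse(uri)
    if parsed.scheme not in {"s3", "gs"} or not parsed.netloc:
        raise ValueError(f"expected an s3:// or gs:// URI, got {uri!r}")
    return {
        "storage": parsed.scheme,
        "bucket": parsed.netloc,
        "prefix": parsed.path.lstrip("/"),
        "description": description,
        "owners": [owner],
        "tags": list(tags),
    }


record = asset_record(
    "s3://data-lake/raw/transactions/",
    description="Raw transaction events from the payment processor",
    owner="Data Engineering",
    tags=["raw data"],
)
print(record)
```

Validating the scheme up front keeps local paths and HTTP URLs from being registered as object-storage assets by mistake.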

Build lineage where it doesn't exist

Legacy scripts, one-off exports, and manual transforms create derived assets with no lineage back to their source. This leaves gaps in impact analysis and makes root cause tracing unreliable. This workflow fills those gaps by registering the known transformation relationship in Atlan.

How it works
1. Identify source and derived assets with no lineage between them (semantic_search)
2. Confirm the transformation relationship exists (get_asset)
3. Register the lineage edge in Atlan (resolve_metadata)
Example prompt
I know the reporting.customer_segments table is derived from analytics.customer_events but there's no lineage between them. Add the lineage connection so downstream impact analysis picks it up.
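
Finding gaps like this one comes down to looking for derived assets that have no incoming lineage edge. The sketch below illustrates that check on a hypothetical asset list and edge list. The `missing_upstream` helper and the reporting.churn_scores table are invented for the example.

```python
def missing_upstream(assets, edges, known_sources):
    """Assets with no incoming lineage edge that are not declared raw sources."""
    has_upstream = {target for _, target in edges}
    return sorted(
        a for a in assets if a not in has_upstream and a not in known_sources
    )


# Hypothetical catalog state: edges are (source, target) pairs.
ASSETS = [
    "analytics.customer_events",
    "reporting.customer_segments",
    "reporting.churn_scores",
]
EDGES = [("analytics.customer_events", "reporting.churn_scores")]
SOURCES = {"analytics.customer_events"}

gaps = missing_upstream(ASSETS, EDGES, SOURCES)
print("Before:", gaps)

# Registering the known relationship closes the gap.
EDGES.append(("analytics.customer_events", "reporting.customer_segments"))
print("After:", missing_upstream(ASSETS, EDGES, SOURCES))
```

Declaring the known raw sources explicitly matters here: without that set, every genuine entry point into the warehouse would also be reported as a gap.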

Build pipeline scaffolding

Starting a new pipeline means creating catalog entries, registering lineage, and writing documentation before the first row of data moves. This workflow generates all that boilerplate from a single description of what the pipeline does.

How it works
1. Define pipeline inputs, outputs, and transformation logic (semantic_search)
2. Create catalog assets for any new tables or datasets (resolve_metadata)
3. Register lineage and write initial documentation (update_assets)
Example prompt
I'm building a new pipeline that reads from raw.web_events and writes to gold.daily_active_users. Create catalog entries for the output table with a description of what it contains, set the Data Engineering team as owner, and register the lineage from the source table.