Build with lineage context

Use these workflows to handle catalog operations as part of the development process: analyzing downstream impact before merging schema changes, documenting SQL models with business context from the catalog, tracing data quality issues to their upstream source, and registering lineage for pipelines that aren't automatically tracked. These operations are typically required before a schema change ships or a new pipeline goes to production, and you can complete them from your AI client without switching tools.

Info: Make sure Atlan MCP is configured before running these workflows. For setup, see Set up Atlan MCP.

SQL impact analysis at code review

A rename or type change in one table can silently break views, dbt models, and dashboards two or three hops downstream. This workflow surfaces every consumer of a changed asset before the PR is merged, so you can notify owners and plan for the impact.

How it works

Identify tables or columns changed in the PR

semantic_search

Traverse downstream lineage to find all consumers

traverse_lineage

Return impacted assets with owners and certification status

get_asset

Example prompt

I'm about to rename the user_id column to customer_id in the dim_users table. Show me every downstream view, table, and dashboard that references dim_users so I can assess the impact before merging.
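
To make the traversal step concrete, here is a minimal sketch of a breadth-first walk over downstream lineage. It does not call Atlan or the traverse_lineage tool. The in-memory graph, every asset name other than dim_users, and the downstream_consumers helper are hypothetical and exist only to illustrate how consumers several hops away are found.

```python
from collections import deque

# Hypothetical lineage edges: asset -> its direct downstream consumers.
DOWNSTREAM = {
    "dim_users": ["stg_user_orders", "vw_active_users"],
    "stg_user_orders": ["fct_orders"],
    "fct_orders": ["user_growth_dashboard"],
    "vw_active_users": [],
}


def downstream_consumers(start, edges):
    """Return {asset: hops} for every asset reachable downstream of start."""
    hops = {}
    queue = deque([(start, 0)])
    while queue:
        asset, depth = queue.popleft()
        for child in edges.get(asset, []):
            # Skip assets already visited so cycles cannot loop forever.
            if child not in hops and child != start:
                hops[child] = depth + 1
                queue.append((child, depth + 1))
    return hops


impacted = downstream_consumers("dim_users", DOWNSTREAM)
for asset, depth in sorted(impacted.items(), key=lambda kv: (kv[1], kv[0])):
    print(f"{depth} hop(s): {asset}")
```

Recording the hop count alongside each asset makes it easy to notify direct consumers first and work outward.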

Document SQL with business context

SQL model documentation written in isolation is often too technical for business users. This workflow pulls the glossary terms and descriptions already in Atlan for the source assets and uses them to generate a business-readable description for your model.

How it works

Retrieve glossary terms and descriptions for source assets

semantic_search

Generate a business-readable description for the model

get_asset

Write the description back to the asset in Atlan

resolve_metadata

Example prompt

For the monthly_revenue_summary model in dbt, look up the Atlan descriptions and glossary terms for its source tables and write a business-friendly description explaining what it calculates and who it's for.
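
As a rough illustration of the first step, the sketch below assembles source-table descriptions and glossary terms into one block of context that a description could be drafted from. The source tables, their descriptions, the glossary terms, and the build_context helper are all hypothetical; in the workflow this metadata would come from Atlan rather than being hard-coded.

```python
# Hypothetical catalog metadata for the model's source tables.
SOURCES = [
    {"name": "fct_orders", "description": "One row per completed order.", "terms": ["Net Revenue"]},
    {"name": "dim_calendar", "description": "One row per calendar day.", "terms": ["Fiscal Month"]},
]


def build_context(model_name, sources):
    """Assemble plain-text context for drafting a business-readable description."""
    lines = [f"Model: {model_name}", "Sources:"]
    for src in sources:
        terms = ", ".join(src["terms"]) or "none"
        lines.append(f"- {src['name']}: {src['description']} (glossary terms: {terms})")
    return "\n".join(lines)


context = build_context("monthly_revenue_summary", SOURCES)
print(context)
```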

Dashboard root cause analysis

When a dashboard shows wrong numbers, the root cause is rarely in the dashboard itself. It's usually a schema change, a bad transformation, or a data issue two or three hops upstream. This workflow traces lineage from the dashboard back to the source to find the broken link.

How it works

Find the dashboard showing the anomaly

semantic_search

Traverse upstream lineage through source tables and transforms

traverse_lineage

Review metadata and announcements to pinpoint the issue

get_asset

Example prompt

The Weekly Sales Dashboard is showing revenue figures 30% lower than expected since Monday. Trace its upstream lineage and show me every source table and transformation in the chain along with their owners and any recent announcements.
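
The upstream trace can be sketched the same way, walking from the dashboard toward its sources and then checking each asset's metadata. Everything below is illustrative: the edges, owners, announcement text, and the upstream_assets helper are assumptions, and no Atlan tool is called.

```python
# Hypothetical upstream edges: asset -> the assets it reads from.
UPSTREAM = {
    "Weekly Sales Dashboard": ["sales_summary"],
    "sales_summary": ["fct_orders"],
    "fct_orders": ["raw_orders", "raw_refunds"],
}

# Hypothetical metadata of the kind the review step would inspect.
METADATA = {
    "sales_summary": {"owner": "analytics", "announcement": None},
    "fct_orders": {"owner": "data-eng", "announcement": None},
    "raw_orders": {"owner": "platform", "announcement": "Schema change deployed Monday"},
    "raw_refunds": {"owner": "platform", "announcement": None},
}


def upstream_assets(start, edges):
    """Return every asset upstream of start, nearest first (breadth-first)."""
    seen, order, frontier = {start}, [], [start]
    while frontier:
        next_frontier = []
        for asset in frontier:
            for parent in edges.get(asset, []):
                if parent not in seen:
                    seen.add(parent)
                    order.append(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return order


chain = upstream_assets("Weekly Sales Dashboard", UPSTREAM)
# Assets with an announcement are the first places to look.
suspects = [a for a in chain if METADATA.get(a, {}).get("announcement")]
for asset in chain:
    meta = METADATA.get(asset, {})
    print(asset, meta.get("owner"), meta.get("announcement"))
```

Ordering the chain nearest-first keeps the review focused: an announcement on a close upstream asset is usually checked before anything further back.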

Build lineage for custom pipelines

Custom scripts, Spark jobs, and internal ETL tools don't get lineage tracked automatically. Without lineage, impact analysis and root cause tracing are incomplete. This workflow registers the source-to-target relationship in Atlan so the full data flow is visible.

How it works

Identify source and target assets for the pipeline

semantic_search

Resolve their qualified names in Atlan

resolve_metadata

get_asset

Example prompt

I have a custom Spark job that reads from raw.clickstream_events and writes to analytics.user_sessions. Register the lineage between these two tables in Atlan so it shows up in the lineage graph.
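
One common way to represent a source-to-target relationship is a process record that lists its inputs and outputs. The sketch below shows that shape with basic validation. The LineageProcess class, the make_process helper, and the process name are hypothetical and are not Atlan's API; only the two table names come from the example prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageProcess:
    """A hypothetical record linking input assets to output assets."""
    name: str
    inputs: tuple
    outputs: tuple


def make_process(name, inputs, outputs):
    """Build a lineage record; both sides must be non-empty and disjoint."""
    if not inputs or not outputs:
        raise ValueError("a lineage process needs at least one input and one output")
    overlap = set(inputs) & set(outputs)
    if overlap:
        raise ValueError(f"asset cannot be both input and output: {sorted(overlap)}")
    return LineageProcess(name, tuple(inputs), tuple(outputs))


process = make_process(
    "clickstream_sessionization",  # hypothetical name for the Spark job
    inputs=["raw.clickstream_events"],
    outputs=["analytics.user_sessions"],
)
print(process)
```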

Onboard files from object storage

Files in S3 or GCS buckets are often invisible to the catalog until someone crawls them manually—by which time they're already being consumed without governance. This workflow registers them proactively with descriptions, owners, and tags before they're used downstream.

How it works

Identify the file paths or buckets to register

semantic_search

Create catalog assets with descriptions and schema info

resolve_metadata

Set owners, tags, and relevant custom metadata

update_assets

Example prompt

Register the parquet files in s3://data-lake/raw/transactions/ as catalog assets. Set the description to raw transaction events from the payment processor, add the Data Engineering team as owner, and tag them as raw data.
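
As an illustration of what a registered file entry might carry, the following sketch turns object-store URIs into simple records with a description, owner, and tags. The record layout, the file_asset helper, and the individual file names are assumptions made for this example; only the s3://data-lake/raw/transactions/ prefix and the metadata values come from the prompt above.

```python
from urllib.parse import urlparse


def file_asset(uri, description, owner, tags):
    """Build a hypothetical catalog record for one object-store file."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("s3", "gs"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    key = parsed.path.lstrip("/")
    return {
        "bucket": parsed.netloc,
        "key": key,
        "name": key.rsplit("/", 1)[-1],
        "description": description,
        "owners": [owner],
        "tags": list(tags),
    }


# The file names are made up; a real run would list the bucket contents.
uris = [
    "s3://data-lake/raw/transactions/part-0000.parquet",
    "s3://data-lake/raw/transactions/part-0001.parquet",
]
assets = [
    file_asset(u, "Raw transaction events from the payment processor",
               "Data Engineering", ["raw data"])
    for u in uris
]
for asset in assets:
    print(asset["bucket"], asset["key"], asset["owners"], asset["tags"])
```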

Build lineage where it doesn't exist

Legacy scripts, one-off exports, and manual transforms create derived assets with no lineage back to their source. This leaves gaps in impact analysis and makes root cause tracing unreliable. This workflow fills those gaps by registering the known transformation relationship in Atlan.

How it works

Identify source and derived assets with no lineage between them

semantic_search

Confirm the transformation relationship exists

get_asset

resolve_metadata

Example prompt

I know the reporting.customer_segments table is derived from analytics.customer_events but there's no lineage between them. Add the lineage connection so downstream impact analysis picks it up.
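
Finding the gaps in the first place amounts to looking for derived assets with no upstream edge. A minimal sketch, assuming a flat list of (source, target) edges: the lineage_gaps helper, the reporting.churn_scores table, and the notion of a declared set of raw sources are hypothetical additions for illustration.

```python
def lineage_gaps(assets, edges, known_sources):
    """Return assets with no upstream edge that are not declared raw sources."""
    has_upstream = {target for _, target in edges}
    return sorted(a for a in assets if a not in has_upstream and a not in known_sources)


assets = {
    "analytics.customer_events",
    "reporting.customer_segments",
    "reporting.churn_scores",  # hypothetical table that already has lineage
}
edges = [("analytics.customer_events", "reporting.churn_scores")]
known_sources = {"analytics.customer_events"}

gaps_before = lineage_gaps(assets, edges, known_sources)
print(gaps_before)

# Registering the known relationship closes the gap.
edges.append(("analytics.customer_events", "reporting.customer_segments"))
gaps_after = lineage_gaps(assets, edges, known_sources)
print(gaps_after)
```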

Build pipeline scaffolding

Starting a new pipeline means creating catalog entries, registering lineage, and writing documentation before the first row of data moves. This workflow generates all that boilerplate from a single description of what the pipeline does.

How it works

Define pipeline inputs, outputs, and transformation logic

semantic_search

Create catalog assets for any new tables or datasets

resolve_metadata

update_assets

Example prompt

I'm building a new pipeline that reads from raw.web_events and writes to gold.daily_active_users. Create catalog entries for the output table with a description of what it contains, set the Data Engineering team as owner, and register the lineage from the source table.
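
The boilerplate for a new pipeline can be thought of as a short, ordered plan: create the output asset, assign ownership, then register lineage. The sketch below generates such a plan as plain data. The operation names, the scaffold_plan helper, and the description text are hypothetical and do not correspond to Atlan tool calls.

```python
def scaffold_plan(source, target, description, owner):
    """Return the ordered catalog operations a new pipeline needs before it runs."""
    return [
        {"op": "create_asset", "asset": target, "description": description},
        {"op": "set_owner", "asset": target, "owner": owner},
        {"op": "register_lineage", "inputs": [source], "outputs": [target]},
    ]


plan = scaffold_plan(
    source="raw.web_events",
    target="gold.daily_active_users",
    description="Distinct active users per day",  # hypothetical description
    owner="Data Engineering",
)
for step in plan:
    print(step)
```

Creating the asset before registering lineage matters in this ordering, since a lineage edge needs both endpoints to exist.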
