Build with lineage context
Use these workflows to handle catalog operations as part of the development process: analyzing downstream impact before merging schema changes, documenting SQL models with business context from the catalog, tracing data quality issues to their upstream source, and registering lineage for pipelines that aren't tracked automatically. These operations are typically required before a schema change ships or a new pipeline goes to production, and you can complete them from your AI client without switching tools.
Make sure Atlan MCP is configured before running these workflows. For setup, see Set up Atlan MCP.
SQL impact analysis at code review
A rename or type change in one table can silently break views, dbt models, and dashboards two or three hops downstream. This workflow surfaces every consumer of a changed asset before the PR is merged, so you can notify owners and plan for the impact.
I'm about to rename the user_id column to customer_id in the dim_users table. Show me every downstream view, table, and dashboard that references dim_users so I can assess the impact before merging.
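Conceptually, this kind of impact analysis is a breadth-first walk over downstream lineage edges. The sketch below only illustrates that idea; it is not Atlan's API or how the MCP tools are implemented. The `LINEAGE` graph, every asset name other than dim_users, and the `downstream_assets` helper are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: asset -> its direct downstream consumers.
LINEAGE = {
    "dim_users": ["vw_active_users", "fct_orders"],
    "vw_active_users": ["Retention Dashboard"],
    "fct_orders": ["orders_rollup"],
    "orders_rollup": ["Orders Dashboard"],
}


def downstream_assets(graph, start):
    """Return {asset: hops} for every asset reachable downstream of start."""
    hops = {}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in graph.get(node, []):
            # Skip assets already reached by a shorter (or equal) path.
            if child not in hops and child != start:
                hops[child] = depth + 1
                queue.append((child, depth + 1))
    return hops


impact = downstream_assets(LINEAGE, "dim_users")
for asset, depth in sorted(impact.items(), key=lambda kv: (kv[1], kv[0])):
    print(f"{depth} hop(s): {asset}")
```

Recording the hop count alongside each asset makes it easier to separate direct consumers from the ones two or three hops away that are most likely to break silently.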
Document SQL with business context
SQL model documentation written in isolation is often too technical for business users. This workflow pulls the glossary terms and descriptions already in Atlan for the source assets and uses them to generate a business-readable description for your model.
For the monthly_revenue_summary model in dbt, look up the Atlan descriptions and glossary terms for its source tables and write a business-friendly description explaining what it calculates and who it's for.
Dashboard root cause analysis
When a dashboard shows wrong numbers, the root cause is almost never in the dashboard itself; it's usually a schema change, bad transformation, or data issue two or three hops upstream. This workflow traces lineage from the dashboard back to the source to find the broken link.
The Weekly Sales Dashboard is showing revenue figures 30% lower than expected since Monday. Trace its upstream lineage and show me every source table and transformation in the chain along with their owners and any recent announcements.
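Root cause tracing runs the same kind of walk in the opposite direction, then looks at the metadata on each upstream asset. The following is a minimal sketch under stated assumptions: the `UPSTREAM` edges, the `META` records (owners and announcements), and the `trace_upstream` helper are all hypothetical and do not reflect Atlan's data model or API.

```python
from collections import deque

# Hypothetical upstream edges: asset -> its direct sources.
UPSTREAM = {
    "Weekly Sales Dashboard": ["sales_summary"],
    "sales_summary": ["stg_orders", "stg_refunds"],
    "stg_orders": ["raw_orders"],
}

# Hypothetical catalog metadata for each upstream asset.
META = {
    "sales_summary": {"owner": "analytics", "announcement": None},
    "stg_orders": {"owner": "data-eng", "announcement": "amount now stored in cents"},
    "stg_refunds": {"owner": "data-eng", "announcement": None},
    "raw_orders": {"owner": "platform", "announcement": None},
}


def trace_upstream(upstream, start):
    """Return [(asset, hops)] for every upstream asset, nearest first."""
    seen = {start}
    chain = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        for source in upstream.get(node, []):
            if source not in seen:
                seen.add(source)
                chain.append((source, depth + 1))
                queue.append((source, depth + 1))
    return chain


chain = trace_upstream(UPSTREAM, "Weekly Sales Dashboard")
# Assets with a recent announcement are the first places to look.
suspects = [asset for asset, _ in chain if META.get(asset, {}).get("announcement")]
for asset, depth in chain:
    info = META.get(asset, {})
    print(depth, asset, info.get("owner"), info.get("announcement"))
```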
Build lineage for custom pipelines
Custom scripts, Spark jobs, and internal ETL tools don't get lineage tracked automatically. Without lineage, impact analysis and root cause tracing are incomplete. This workflow registers the source-to-target relationship in Atlan so the full data flow is visible.
I have a custom Spark job that reads from raw.clickstream_events and writes to analytics.user_sessions. Register the lineage between these two tables in Atlan so it shows up in the lineage graph.
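Registering lineage amounts to recording a transformation step that links its input assets to its output assets. The sketch below models that relationship with plain Python dataclasses so the shape of the record is visible. `LineageProcess`, `LineageGraph`, and the job name are hypothetical illustrations, not Atlan SDK classes or the payload the MCP tool sends.

```python
from dataclasses import dataclass, field


@dataclass
class LineageProcess:
    """Hypothetical record for one transformation step: inputs -> outputs."""
    name: str
    inputs: list
    outputs: list


@dataclass
class LineageGraph:
    """Hypothetical in-memory lineage store: source -> set of targets."""
    edges: dict = field(default_factory=dict)

    def register(self, process):
        # Every input of the process becomes upstream of every output.
        for source in process.inputs:
            for target in process.outputs:
                self.edges.setdefault(source, set()).add(target)


graph = LineageGraph()
graph.register(
    LineageProcess(
        name="spark_sessionize_clickstream",  # hypothetical job name
        inputs=["raw.clickstream_events"],
        outputs=["analytics.user_sessions"],
    )
)
print(graph.edges)
```

Once the edge exists, downstream impact analysis and upstream root cause tracing can both traverse it like any automatically captured lineage.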
Onboard files from object storage
Files in S3 or GCS buckets are often invisible to the catalog until someone crawls them manually, by which time they're already being consumed without governance. This workflow registers them proactively with descriptions, owners, and tags before they're used downstream.
Register the Parquet files in s3://data-lake/raw/transactions/ as catalog assets. Set the description to "Raw transaction events from the payment processor", add the Data Engineering team as owner, and tag them as raw data.
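To make the pieces of that request concrete, the sketch below splits the storage URI into bucket and prefix and bundles them with the description, owner, and tag. The `object_store_asset` helper and the field names in the returned dictionary are hypothetical; they illustrate the information being registered, not Atlan's asset schema.

```python
from urllib.parse import urlparse


def object_store_asset(uri, description, owner, tags):
    """Build a hypothetical catalog record for an object-storage location."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("s3", "gs"):
        raise ValueError(f"unsupported storage scheme: {parsed.scheme!r}")
    return {
        "bucket": parsed.netloc,              # e.g. "data-lake"
        "prefix": parsed.path.lstrip("/"),    # e.g. "raw/transactions/"
        "description": description,
        "owner": owner,
        "tags": list(tags),
    }


asset = object_store_asset(
    "s3://data-lake/raw/transactions/",
    description="Raw transaction events from the payment processor",
    owner="Data Engineering",
    tags=["raw data"],
)
print(asset)
```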
Build lineage where it doesn't exist
Legacy scripts, one-off exports, and manual transforms create derived assets with no lineage back to their source. This leaves gaps in impact analysis and makes root cause tracing unreliable. This workflow fills those gaps by registering the known transformation relationship in Atlan.
I know the reporting.customer_segments table is derived from analytics.customer_events but there's no lineage between them. Add the lineage connection so downstream impact analysis picks it up.
Build pipeline scaffolding
Starting a new pipeline means creating catalog entries, registering lineage, and writing documentation before the first row of data moves. This workflow generates all that boilerplate from a single description of what the pipeline does.
I'm building a new pipeline that reads from raw.web_events and writes to gold.daily_active_users. Create catalog entries for the output table with a description of what it contains, set the Data Engineering team as owner, and register the lineage from the source table.