Extract Databricks AI model lineage
Atlan supports lineage only for models tracked on Databricks-hosted MLflow tracking servers (where the tracking URI is `databricks`). External or self-hosted MLflow tracking servers aren't supported.
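As a rough illustration of this restriction, the sketch below distinguishes Databricks-hosted tracking URIs from external ones. The helper name and the URI examples are illustrative, not part of Atlan's or MLflow's API.

```python
def is_databricks_hosted(tracking_uri: str) -> bool:
    """Return True if an MLflow tracking URI points at a Databricks-hosted
    tracking server (the only kind supported for lineage extraction)."""
    # "databricks" (default profile) and "databricks://<profile>" are hosted;
    # anything else (https://..., file:/..., sqlite:///...) is external.
    scheme = tracking_uri.split("://", 1)[0]
    return scheme == "databricks"

print(is_databricks_hosted("databricks"))                    # True
print(is_databricks_hosted("databricks://my-profile"))       # True
print(is_databricks_hosted("https://mlflow.internal:5000"))  # False
```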
Once you have crawled Databricks AI models, Atlan can build lineage connecting those models to the upstream datasets, tables, and functions they depend on. This gives you end-to-end visibility into how data flows from source assets into trained model versions.
Prerequisites
Before extracting AI model lineage, make sure you have:
- A Unity Catalog-enabled Databricks workspace
- Crawled Databricks AI models at least once
- All permissions required for AI model crawling in place
Extract lineage
Lineage is built automatically during the Databricks crawler run; no separate workflow is needed.
To extract AI model lineage:
1. Make sure the Databricks crawler is configured for AI models with the Direct extraction strategy.

2. If your models use Databricks Feature Store and the `feature_spec.yaml` artifact is stored in an external location, grant read access to the Atlan service account:

   ```sql
   GRANT READ FILES ON EXTERNAL LOCATION <external_location_name> TO <atlan_user_or_role>;
   ```

   If the artifact is inaccessible, Atlan skips Feature Store lineage for that model version and falls back to run-based lineage where available.

3. Run the Databricks crawler workflow.

4. After the workflow completes, navigate to any AI Model Version asset in Atlan to view its lineage. The lineage graph shows upstream tables, feature views, and functions that fed into the model version.
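The Feature Store fallback described above (prefer `feature_spec.yaml`, fall back to run-based lineage, else record no lineage) can be sketched as a small decision function. The types and field names here are hypothetical placeholders, not Atlan's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelVersionArtifacts:
    """Hypothetical summary of what is readable for one model version."""
    feature_spec_readable: bool  # can the feature_spec.yaml artifact be read?
    has_run_inputs: bool         # does the MLflow run record input datasets?


def lineage_source(artifacts: ModelVersionArtifacts) -> Optional[str]:
    """Pick the lineage source in the fallback order described above."""
    if artifacts.feature_spec_readable:
        return "feature-store"   # full Feature Store lineage
    if artifacts.has_run_inputs:
        return "run-based"       # fallback: lineage from run inputs
    return None                  # no lineage for this model version

print(lineage_source(ModelVersionArtifacts(True, True)))    # feature-store
print(lineage_source(ModelVersionArtifacts(False, True)))   # run-based
print(lineage_source(ModelVersionArtifacts(False, False)))  # None
```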
Cross-workspace lineage for Databricks AI models isn't yet supported. Support for tracing lineage across Databricks workspaces is planned for a future release.
See also
- Crawl Databricks AI models: Discover and catalog AI models from your Unity Catalog Model Registry.
- How Atlan builds lineage for Databricks AI models: Understand the lineage sources, relationships created, and asset behavior.