
How Atlan builds lineage for Databricks AI models

This FAQ covers how Atlan constructs lineage for Databricks AI models, what relationships are created, and how assets like notebooks and functions are handled.

How does Atlan build lineage for Databricks AI models?

Atlan uses two sources to construct lineage for each model version:

  • Run details: For model versions with an associated MLflow run, Atlan reads the input datasets logged to that run. Tables referenced as inputs are linked as upstream dependencies of the model version.
  • Feature store artifacts: For feature engineering models, Atlan reads the feature_spec.yaml artifact to discover upstream feature views, tables, and functions used during training.
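The way the two sources combine can be sketched in a few lines of Python. This is only an illustration of the idea, not Atlan's implementation: the `LineageEdge` class, the `build_edges` function, the `tables` and `functions` keys, and all asset names are hypothetical, and the feature spec is assumed to be already parsed into a dictionary rather than mirroring the real feature_spec.yaml schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageEdge:
    """One upstream -> model version relationship (hypothetical shape)."""
    upstream_type: str   # "Table" or "Function"
    upstream_name: str
    downstream: str      # model version identifier
    source: str          # where the relationship was discovered


def build_edges(model_version, run_inputs=None, feature_spec=None):
    """Collect upstream edges for one model version from both sources."""
    edges = []

    # Source 1: input datasets logged to the MLflow run, if a run exists.
    for table in run_inputs or []:
        edges.append(
            LineageEdge("Table", table, model_version, "MLflow run inputs")
        )

    # Source 2: feature spec content, assumed to be parsed into a dict
    # with hypothetical "tables" and "functions" keys.
    spec = feature_spec or {}
    for table in spec.get("tables", []):
        edges.append(
            LineageEdge("Table", table, model_version, "feature_spec.yaml")
        )
    for function in spec.get("functions", []):
        edges.append(
            LineageEdge("Function", function, model_version, "feature_spec.yaml")
        )

    # Drop exact duplicates while keeping discovery order.
    return list(dict.fromkeys(edges))


edges = build_edges(
    "main.ml.churn_model/3",
    run_inputs=["main.sales.orders"],
    feature_spec={
        "tables": ["main.features.customer"],
        "functions": ["main.features.days_since_signup"],
    },
)
for edge in edges:
    print(f"{edge.upstream_type} {edge.upstream_name} -> {edge.downstream} ({edge.source})")
```

A model version without an MLflow run or a feature spec simply yields no edges in this sketch, which matches the idea that each source contributes only when it is present.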

What lineage relationships does Atlan create for AI models?

Upstream asset | Downstream asset | Source
Table          | AI model version | Input datasets in MLflow run details
Table          | AI model version | feature_spec.yaml artifact
Function       | AI model version | feature_spec.yaml artifact

Why do Notebook and Function assets appear in Atlan after lineage extraction?

When Atlan encounters a notebook or function during lineage extraction, it creates a corresponding Notebook or Function asset in Atlan so those assets can be navigated and governed alongside other data assets.

Why do notebook and job details appear in lineage without creating asset-to-asset edges?

When a model version is linked to a Databricks notebook, job, or project, Atlan captures those details as additional ETL context on the lineage process entity. This context is visible in the lineage panel but doesn't create separate asset-to-asset lineage edges.
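The difference between context and edges can be pictured with a small Python sketch. It is a conceptual model only, assuming a process entity that holds inputs, outputs, and a free-form context dictionary; the `LineageProcess` class, the `asset_edges` function, the `etl_context` field, and every name and path below are hypothetical and do not describe Atlan's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class LineageProcess:
    """Hypothetical process entity connecting upstream and downstream assets."""
    inputs: list
    outputs: list
    # Notebook, job, or project details ride along as metadata only.
    etl_context: dict = field(default_factory=dict)


def asset_edges(process):
    """Derive asset-to-asset edges from inputs and outputs alone."""
    return [(i, o) for i in process.inputs for o in process.outputs]


process = LineageProcess(
    inputs=["main.sales.orders"],
    outputs=["main.ml.churn_model/3"],
    etl_context={"notebook": "/Repos/ml/train_churn", "job": "nightly-train"},
)

print(asset_edges(process))   # edges come only from inputs and outputs
print(process.etl_context)    # context is available for display, not as edges
```

In this sketch the notebook and job are readable from the process but never appear as endpoints of an edge, which is the behavior the answer above describes.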

