Lineage Builder
App

The Lineage Builder app enables you to create lineage relationships between any source and target assets in Atlan by importing lineage data from a CSV file. This app is useful for adding custom lineage connections that aren't automatically discovered by Atlan's connectors, or for injecting lineage from transformation tools that Atlan doesn't mine out-of-the-box.

The app processes each row in your CSV file to create or update a lineage process that connects source assets to target assets. You can configure how the app handles assets that don't exist in Atlan: skip them, create partial assets that appear only in lineage, or create full assets that are discoverable through search.

Configuration

This section defines the fields required for workflow setup.

Workflow name

Specifies a unique and descriptive name to identify the workflow configuration in the Atlan interface. This name appears in the workflow list and helps distinguish it from other lineage workflows.

Example:

production-etl-lineage-import

Source

Defines how you want to provide the input CSV file containing lineage data to be processed.

Upload a CSV file directly into Atlan from your local system. This method is recommended for ad-hoc or one-time imports.

  • Lineage file: The CSV file containing lineage details to load. The file must follow the format described in the Lineage builder CSV file section.

  • File size is generally limited to about 10 MB. For larger files, use the Object storage option.

Options

This section defines the fields required to build lineage from a CSV file.

Unknown asset handling

Specifies how you want to handle source and target assets in the input file that don't match any assets in Atlan.

Skip them: Any source or target asset in the input file that doesn't match an asset in Atlan is skipped, and the lineage row that references it isn't loaded. Use this option when you want to create lineage only for assets that already exist in Atlan, which protects data quality and prevents orphaned lineage relationships.

Example: Your CSV file contains lineage from RAW.ORDERS (exists in Atlan) to STAGING.ORDERS_NEW (doesn't exist in Atlan). With Skip them selected, the entire lineage row is skipped because the target asset doesn't exist. No lineage is created, and STAGING.ORDERS_NEW isn't created in Atlan.
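The skip behavior in the example above can be sketched as a simple filter. This is an illustration only, not the app's implementation: the known_assets set and the split_rows helper are hypothetical stand-ins for the lookup against Atlan.

```python
# Hypothetical in-memory stand-in for "assets that already exist in Atlan".
known_assets = {"RAW.ORDERS", "RAW.CUSTOMERS"}

rows = [
    {"source": "RAW.ORDERS", "target": "STAGING.ORDERS_NEW"},
    {"source": "RAW.CUSTOMERS", "target": "RAW.ORDERS"},
]


def split_rows(rows, known_assets):
    """Separate rows whose source and target both exist from rows to skip."""
    loaded, skipped = [], []
    for row in rows:
        if row["source"] in known_assets and row["target"] in known_assets:
            loaded.append(row)
        else:
            # One unknown asset is enough to skip the whole lineage row.
            skipped.append(row)
    return loaded, skipped


loaded, skipped = split_rows(rows, known_assets)
print(f"loaded={len(loaded)} skipped={len(skipped)}")
```

Note that nothing is created for STAGING.ORDERS_NEW: the row is dropped as a whole rather than partially applied.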

Fail on errors

Specifies whether an invalid value in a field causes the import to fail or logs a warning and proceeds.

  • Yes: The import as a whole is marked as failed if any row contains invalid data. Lineage for rows that follow the point where the failure is detected may not be created. Rows that were already processed before that point aren't rolled back.
  • No: Invalid values log a warning, the app skips that row, and then continues processing remaining rows.

Example: If a row contains an invalid connection name, selecting Yes fails the entire import, while selecting No skips that row and processes the rest of the file.
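The difference between the two settings can be sketched as follows. This is a simplified model under stated assumptions: import_rows, ImportFailed, and the is_valid callback are hypothetical names, and the sketch stops at the first invalid row, whereas the app only states that later rows may not be created.

```python
import logging


class ImportFailed(Exception):
    """Hypothetical error raised when fail-on-errors is enabled."""


def import_rows(rows, is_valid, fail_on_errors, created):
    """Append valid rows to `created`; fail or skip on invalid ones."""
    for number, row in enumerate(rows, start=1):
        if is_valid(row):
            created.append(row)
        elif fail_on_errors:
            # Rows already appended to `created` are not rolled back.
            raise ImportFailed(f"row {number} contains invalid data")
        else:
            logging.warning("Skipping invalid row %d", number)


rows = [
    {"connection": "app-raw"},
    {"connection": ""},  # invalid: empty connection name
    {"connection": "prod-analytics"},
]


def has_connection(row):
    return bool(row["connection"])


created_lenient = []
import_rows(rows, has_connection, False, created_lenient)

created_strict = []
try:
    import_rows(rows, has_connection, True, created_strict)
except ImportFailed as error:
    print(f"import failed: {error}")
```

In this sketch the lenient run keeps the two valid rows, while the strict run keeps only the row that was processed before the failure.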

Case-sensitive match for assets

Determines whether asset matching is case-sensitive or case-insensitive.

  • Yes: Asset matching is case-sensitive. For example, Orders matches only Orders and not orders.
  • No: Asset matching is case-insensitive. For example, Orders matches both Orders and orders.

Use case-sensitive matching when your source systems have strict naming conventions where case differences indicate different assets.
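As a rough illustration of the two modes, the comparison below uses Python's str.casefold for the case-insensitive branch. The find_matches helper is hypothetical, and how the app normalizes case internally isn't documented here.

```python
def find_matches(name, existing_names, case_sensitive):
    """Return the existing names that match `name` under the chosen mode."""
    if case_sensitive:
        return [n for n in existing_names if n == name]
    folded = name.casefold()
    return [n for n in existing_names if n.casefold() == folded]


existing = ["Orders", "orders", "Customers"]

strict = find_matches("Orders", existing, case_sensitive=True)
relaxed = find_matches("Orders", existing, case_sensitive=False)
print(strict, relaxed)
```

When case-insensitive matching finds more than one candidate, as it does here, the result is ambiguous, which is one reason to prefer case-sensitive matching for systems where case distinguishes assets.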

Field separator

The single character that's used to separate values in the input file.

  • Typically either a comma (,) or a semicolon (;).
  • Must match the actual separator used in your CSV file.

Example:

,
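To see why the separator must match the file, the sketch below parses the same semicolon-separated text twice with Python's standard csv module. The read_rows helper and the two-column sample are illustrative only.

```python
import csv
import io

# A semicolon-separated sample with two of the required columns.
text = "Source type;Source connector\nTable;mssql\n"


def read_rows(text, separator):
    """Parse CSV text using the given single-character separator."""
    return list(csv.reader(io.StringIO(text), delimiter=separator))


right = read_rows(text, ";")
wrong = read_rows(text, ",")

print(right[1])  # two separate values
print(wrong[1])  # one value containing the whole line
```

With the wrong separator every line collapses into a single value, so none of the expected columns can be found.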

Batch size

The maximum number of rows of input to process per underlying API call.

  • Larger batch sizes can improve performance but may increase memory usage.
  • Smaller batch sizes provide better error isolation but may be slower.

Example:

100
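Batching can be pictured as slicing the input rows into groups of at most the configured size, with one call per group. The batches generator below is a hypothetical sketch of that idea, not the app's code.

```python
from itertools import islice


def batches(rows, batch_size):
    """Yield lists of at most `batch_size` rows, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 250 input rows with a batch size of 100 produce three batches.
sizes = [len(batch) for batch in batches(range(250), 100)]
print(sizes)
```

The last batch is smaller than the others whenever the row count isn't a multiple of the batch size.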

Lineage builder CSV file

The Lineage Builder CSV file defines lineage relationships between source and target assets. Each row represents a single lineage process to be created or updated.

Connections must pre-exist

This app won't create any connections. The connections referenced in the CSV file must already exist in Atlan before running the workflow.
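Because missing connections can't be created by the app, it may help to check the file before running the workflow. The sketch below assumes the CSV header names match the field names used in this document and that you maintain your own list of existing connection names; missing_connections is a hypothetical helper.

```python
import csv
import io

# Assumed header names for the three connection columns.
CONNECTION_COLUMNS = [
    "Source connection",
    "Target connection",
    "Transformation connection",
]


def missing_connections(csv_text, existing_connections, separator=","):
    """Return connection names referenced in the CSV but not known to exist."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator)
    referenced = set()
    for row in reader:
        for column in CONNECTION_COLUMNS:
            referenced.add(row[column])
    return sorted(referenced - set(existing_connections))


sample = (
    "Source connection,Target connection,Transformation connection\n"
    "app-raw,prod-analytics,prod-transformations\n"
)

missing = missing_connections(sample, {"app-raw", "prod-analytics"})
print(missing)
```

Any name reported here needs to be created in Atlan first.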

Required fields

  • Source type: The type of the source (input, upstream) asset in the lineage. Must match a valid asset type in Atlan, such as Table, View, MaterialisedView, Column, or S3Object.

    Example: Table (specifies the source asset is a table)

  • Source connector: The type of connector for the source asset, indicating the kind of source system. Must match a valid connector type in Atlan, such as snowflake, mssql, postgresql, or s3.

    Example: mssql (indicates the source is from Microsoft SQL Server)

  • Source connection: The name of the connection in Atlan that contains the source asset. Must match an existing connection name in Atlan. If the asset doesn't exist and creation is permitted, this is the connection in which the source asset is created.

    Example: app-raw (the connection name in Atlan)

  • Source Identity: The unique identity of the source asset in Atlan within the connection. Represents the asset's qualified name path, excluding the connection portion. The format typically follows a hierarchical structure such as database/schema/table or database/schema/table/column.

    Example: db1/schema1/source_table (the path to the source asset within the connection)

  • Source name: The simple display name of the source asset. Represents the name that appears in the Atlan interface.

    Example: source_table (the display name of the source asset)

  • Target type: The type of the target (output, downstream) asset in the lineage. Must match a valid asset type in Atlan, such as Table, View, MaterialisedView, Column, or S3Object.

    Example: View (specifies the target asset is a view)

  • Target connector: The type of connector for the target asset, indicating the kind of target system. Must match a valid connector type in Atlan, such as snowflake, mssql, postgresql, or s3.

    Example: snowflake (indicates the target is in Snowflake)

  • Target connection: The name of the connection in Atlan that contains the target asset. Must match an existing connection name in Atlan. If the asset doesn't exist and creation is permitted, this is the connection in which the target asset is created.

    Example: prod-analytics (the connection name in Atlan)

  • Target Identity: The unique identity of the target asset in Atlan within the connection. Represents the asset's qualified name path, excluding the connection portion. The format typically follows a hierarchical structure such as database/schema/view or database/schema/table/column.

    Example: db2/schema2/target_view (the path to the target asset within the connection)

  • Target name: The simple display name of the target asset. Represents the name that appears in the Atlan interface.

    Example: target_view (the display name of the target asset)

  • Transformation connector: The type of connector for the transformation or data movement process itself, indicating the kind of system that performs the transformation. Must match a valid connector type in Atlan, such as dbt, mulesoft, airflow, or spark.

    Example: mulesoft (indicates the transformation is performed by MuleSoft)

  • Transformation connection: The name of the connection in Atlan that contains the lineage process. Must match an existing connection name in Atlan. This is where the lineage process is created or updated.

    Example: prod-transformations (the connection name for transformation processes)

  • Transformation Identity: The unique identity of the transformation (lineage) process in Atlan within the connection. Represents the process's qualified name path, excluding the connection portion. The format typically follows a hierarchical structure such as workflow/task/transformation or a unique identifier.

    Example: xform_123 (the unique identifier for the transformation process)

  • Transformation name: The simple display name to use when displaying the lineage process in the Atlan interface. Represents the name that appears in lineage graphs and process listings.

    Example: source_table > target_view (a descriptive name for the transformation process)
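Putting the required fields together, a single lineage row can be generated with Python's csv module. The values are the examples given above. The exact header spelling is an assumption taken from the field names in this list, so compare it against the sample CSV file before use.

```python
import csv
import io

# Header names as listed in this section (spelling assumed, not verified).
REQUIRED_FIELDS = [
    "Source type", "Source connector", "Source connection",
    "Source Identity", "Source name",
    "Target type", "Target connector", "Target connection",
    "Target Identity", "Target name",
    "Transformation connector", "Transformation connection",
    "Transformation Identity", "Transformation name",
]

row = {
    "Source type": "Table",
    "Source connector": "mssql",
    "Source connection": "app-raw",
    "Source Identity": "db1/schema1/source_table",
    "Source name": "source_table",
    "Target type": "View",
    "Target connector": "snowflake",
    "Target connection": "prod-analytics",
    "Target Identity": "db2/schema2/target_view",
    "Target name": "target_view",
    "Transformation connector": "mulesoft",
    "Transformation connection": "prod-transformations",
    "Transformation Identity": "xform_123",
    "Transformation name": "source_table > target_view",
}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=REQUIRED_FIELDS)
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

DictWriter raises a ValueError if the row contains a key that isn't in the field list, which catches misspelled column names early.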

Matching behavior

When importing, assets are matched with existing objects in Atlan using the following rules:

  • Asset matching: Both the asset's qualified name (constructed from connector, connection, and identity) and type must match an existing asset.
  • Case sensitivity: Matching is case-sensitive by default and can be made case-insensitive through the Case-sensitive match for assets option.

CSV data must align with these rules to successfully match existing assets. If an asset doesn't match and creation is permitted (via the Unknown asset handling option), a new asset is created.
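The first rule can be modeled as a lookup on a composite key. The AssetKey tuple below is a hypothetical simplification: Atlan's real qualified names have their own internal format, and this sketch only shows that the type and every part of the name must agree for a match.

```python
from typing import NamedTuple


class AssetKey(NamedTuple):
    """Simplified stand-in for 'type plus qualified name'."""
    asset_type: str
    connector: str
    connection: str
    identity: str


def source_key(row):
    """Build the lookup key for the source asset of a CSV row."""
    return AssetKey(
        row["Source type"],
        row["Source connector"],
        row["Source connection"],
        row["Source Identity"],
    )


# Hypothetical set of assets that already exist.
existing = {AssetKey("Table", "mssql", "app-raw", "db1/schema1/source_table")}

row = {
    "Source type": "Table",
    "Source connector": "mssql",
    "Source connection": "app-raw",
    "Source Identity": "db1/schema1/source_table",
}

matched = source_key(row) in existing

# Same name but a different type: no match.
as_view = {**row, "Source type": "View"}
matched_as_view = source_key(as_view) in existing
print(matched, matched_as_view)
```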

Common fields

You can use these common fields on lineage processes to add additional metadata beyond the required fields.

  • description: Optional description of the lineage process.

    Example: ETL process that transforms raw customer data into analytics-ready format

  • userDescription: Optional user-provided description of the lineage process.

    Example: Custom transformation that aggregates daily sales data into monthly summaries

  • sql: Optional SQL query giving details of how the lineage was derived.

    Example: CREATE VIEW target_view AS SELECT * from source_table WHERE some_column > 0;

For any attribute that supports multiple values, put each value on its own line within the cell, separated by a newline.
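A cell that contains newlines has to be quoted in the CSV file, which Python's csv module does automatically. In the sketch below, exampleMultiValued is a hypothetical column name used only to show the encoding.

```python
import csv
import io

values = ["alice", "bob"]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Transformation name", "exampleMultiValued"])
# Join the values with a newline; the writer quotes the cell for us.
writer.writerow(["source_table > target_view", "\n".join(values)])
csv_text = buffer.getvalue()

# Reading the text back recovers the individual values.
rows = list(csv.reader(io.StringIO(csv_text, newline="")))
recovered = rows[1][1].split("\n")
print(recovered)
```

Spreadsheet tools generally produce the same quoting when you press the in-cell line break shortcut, so hand-edited files follow the same convention.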

You can also supply any number of additional columns beyond the required ones. Column names must match one of the lineage process properties or more general asset properties.

Sample CSV file

Download a sample CSV file to understand the required structure: Download sample lineage CSV

Sample file disclaimer

This sample file shows the structure and format only. It may not import as-is and is merely a template for creating your own CSV files.
