Lineage Builder App
The Lineage Builder app enables you to create lineage relationships between any source and target assets in Atlan by importing lineage data from a CSV file. This app is useful for adding custom lineage connections that aren't automatically discovered by Atlan's connectors, or for injecting lineage from transformation tools that Atlan doesn't mine out of the box.
The app processes each row in your CSV file to create or update lineage processes that connect source assets to target assets. You can configure how the app handles assets that don't exist in Atlan: skip them, create partial assets that appear only in lineage, or create full assets that are discoverable through search.
Configuration
This section defines the fields required for workflow setup.
Workflow name
Specifies a unique and descriptive name to identify the workflow configuration in the Atlan interface. This name appears in the workflow list and helps distinguish it from other lineage workflows.
Example:
production-etl-lineage-import
Source
Defines how you want to provide the input CSV file containing lineage data to be processed.
- Direct upload
- Object storage
Direct upload: Upload a CSV file directly into Atlan from your local system. This method is recommended for ad-hoc or one-time imports.
- Lineage file: The CSV file containing lineage details to load. The file must follow the format described in the Lineage builder CSV file section.
- File size is generally limited to ~10 MB. For larger files, use the Object storage option.
Object storage: Use this option to fetch the lineage CSV file from a cloud object store such as Amazon S3, Google Cloud Storage (GCS), or Azure Data Lake Storage (ADLS). Object storage is recommended for larger files, automated pipelines, or recurring imports where the lineage file is generated or updated outside Atlan. It also provides a more reliable and scalable alternative when the file size or access pattern makes direct uploads less suitable.
When you select Object storage, the workflow retrieves the CSV file from a specific path or object location within your storage system. The file must follow the format described in the Lineage builder CSV file section, regardless of where it's stored.
For detailed information on configuring storage credentials, access methods, and required fields for each provider, see the general Object storage configuration for apps guide, which applies to S3, GCS, and ADLS-based imports.
Options
This section defines the fields required to build lineage from a CSV file.
Unknown asset handling
Specifies how you want to handle source and target assets in the input file that don't match any assets in Atlan.
- Skip them
- Create partial assets
- Create full assets
Skip them: Any source or target asset in the input file that doesn't match an asset in Atlan is skipped, and the lineage rows that reference it aren't loaded. Use this option when you want to create lineage only for assets that already exist in Atlan, ensuring data quality and preventing orphaned lineage relationships.
Example: Your CSV file contains lineage from RAW.ORDERS (exists in Atlan) to STAGING.ORDERS_NEW (doesn't exist in Atlan). With Skip them selected, the entire lineage row is skipped because the target asset doesn't exist. No lineage is created, and STAGING.ORDERS_NEW isn't created in Atlan.
Create partial assets: For any source or target asset in the input file that doesn't match an asset in Atlan, the app creates a partial asset. Partial assets appear only in lineage; they can't be found through search or opened in the asset sidebar. They can later be "resolved" into full assets, which are then discoverable and visible in the asset sidebar.
Use this option when you need to represent lineage for assets that exist in external systems but haven't been fully ingested into Atlan yet.
To learn more about what partial assets are and how they work, see What are partial assets.
The Lineage Builder app can create partial assets for lineage at the container level (table, view, materialized view), but not at the child (field) level (such as columns). If you want to create field-level lineage using partial assets, you must first create those field-level partial assets using another app or method.
Example: Your CSV file contains lineage from RAW.ORDERS (exists in Atlan) to EXTERNAL_SYSTEM.CUSTOMER_DATA (doesn't exist in Atlan). With Create partial assets selected, the lineage is created, and EXTERNAL_SYSTEM.CUSTOMER_DATA appears as a partial asset in the lineage graph. You can see the lineage connection, but you can't search for or open the partial asset in the asset sidebar. Later, when you fully ingest this asset through a connector, you can resolve the partial asset into a full one.
Create full assets: For any source or target asset in the input file that doesn't match an asset in Atlan, the app creates a full asset. These assets behave like any other: they're discoverable through search and appear in the asset sidebar.
Use this option when you want to create both the assets and their lineage relationships in a single import operation.
The Lineage Builder app can create full assets at the container level (table, view, materialized view), but not at the child (field) level (such as columns). If you want to create field-level lineage using new full assets, you must first create those field-level full assets using another app or method.
Example: Your CSV file contains lineage from RAW.ORDERS (exists in Atlan) to STAGING.ORDERS_NEW (doesn't exist in Atlan). With Create full assets selected, the lineage is created, and STAGING.ORDERS_NEW is created as a full asset in Atlan. You can immediately search for STAGING.ORDERS_NEW, view its details in the asset sidebar, and see it in lineage graphs.
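The three options can be summarized as a small decision function. This is only an illustrative sketch of the behavior described above, not the app's implementation: the names `UnknownAssetHandling` and `plan_row` are hypothetical, and the sketch ignores the container-level restriction on created assets.

```python
from enum import Enum


class UnknownAssetHandling(Enum):
    """The three documented options (enum name is hypothetical)."""
    SKIP = "Skip them"
    PARTIAL = "Create partial assets"
    FULL = "Create full assets"


def plan_row(source_exists: bool, target_exists: bool,
             mode: UnknownAssetHandling) -> dict:
    """Describe what happens to one lineage row under the chosen option."""
    missing = [side for side, exists in
               (("source", source_exists), ("target", target_exists))
               if not exists]
    if not missing:
        # Both assets already exist in Atlan: lineage is created as-is.
        return {"lineage_created": True, "assets_created": [], "asset_kind": None}
    if mode is UnknownAssetHandling.SKIP:
        # The whole row is skipped and nothing is created.
        return {"lineage_created": False, "assets_created": [], "asset_kind": None}
    kind = "partial" if mode is UnknownAssetHandling.PARTIAL else "full"
    return {"lineage_created": True, "assets_created": missing, "asset_kind": kind}
```

For the RAW.ORDERS to STAGING.ORDERS_NEW example, `plan_row(True, False, UnknownAssetHandling.SKIP)` reports that no lineage is created, while the other two modes report that the target is created as a partial or full asset.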
Fail on errors
Specifies whether an invalid value in a field causes the import to fail or logs a warning and proceeds.
- Yes: The import as a whole is marked as failed if any row contains invalid data. Lineage for rows that come after the point where the failure is detected may not be created. Rows that were already processed before the failure aren't rolled back.
- No: Invalid values log a warning, the app skips that row, and then continues processing remaining rows.
Example: If a row contains an invalid connection name, selecting Yes fails the entire import, while selecting No skips that row and processes the rest of the file.
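The difference between the two settings can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the app's code: `process_rows` and `has_connection` are hypothetical names, validation is reduced to a single check, and the sketch stops at the first invalid row, whereas the app processes rows in batches.

```python
def process_rows(rows, is_valid, fail_on_errors):
    """Process rows in order and report what was loaded.

    With fail_on_errors=True the run stops at the first invalid row;
    rows handled before that point stay processed (no rollback).
    With fail_on_errors=False invalid rows are skipped with a warning.
    """
    processed, warnings = [], []
    for number, row in enumerate(rows, start=1):
        if is_valid(row):
            processed.append(row)
            continue
        if fail_on_errors:
            return {"status": "failed", "processed": processed,
                    "warnings": warnings, "failed_at": number}
        warnings.append(f"row {number}: invalid value, row skipped")
    return {"status": "succeeded", "processed": processed,
            "warnings": warnings, "failed_at": None}


def has_connection(row):
    """Hypothetical validity check: the connection name must be non-empty."""
    return bool(row["Source connection"])


rows = [
    {"Source connection": "app-raw"},
    {"Source connection": ""},            # invalid row
    {"Source connection": "prod-analytics"},
]
strict = process_rows(rows, has_connection, fail_on_errors=True)
lenient = process_rows(rows, has_connection, fail_on_errors=False)
```

Here `strict` fails at row 2 with one row already processed, while `lenient` succeeds with two rows processed and one warning.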
Case-sensitive match for assets
Determines whether attempts to match assets are done case-sensitively or case-insensitively.
- Yes: Asset matching is case-sensitive. For example, Orders matches only Orders and not orders.
- No: Asset matching is case-insensitive. For example, Orders matches both Orders and orders.
Use case-sensitive matching when your source systems have strict naming conventions where case differences indicate different assets.
Field separator
The single character that's used to separate values in the input file.
- Typically either a comma (,) or a semicolon (;).
- Must match the actual separator used in your CSV file.
Example:
,
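To see why the separator must match the file, the following sketch parses the same semicolon-separated text twice with Python's standard `csv` module. The helper `read_rows` and the two-column sample are illustrative assumptions, not part of the app.

```python
import csv
import io


def read_rows(text, separator=","):
    """Parse CSV text into a list of dicts using the given field separator."""
    return list(csv.DictReader(io.StringIO(text), delimiter=separator))


# A semicolon-separated file (shortened to two columns for illustration).
sample = "Source type;Source name\nTable;source_table\n"

rows_ok = read_rows(sample, separator=";")     # separator matches the file
rows_wrong = read_rows(sample, separator=",")  # separator does not match
```

With the matching separator each column is parsed separately; with the wrong one, the whole line collapses into a single column named `Source type;Source name`, so none of the expected fields would be found.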
Batch size
The maximum number of rows of input to process per underlying API call.
- Larger batch sizes can improve performance but may increase memory usage.
- Smaller batch sizes provide better error isolation but may be slower.
Example:
100
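As a rough illustration of what the batch size controls, the sketch below groups input rows into batches of at most `batch_size` items. The helper `iter_batches` is hypothetical; how the app actually groups rows into API calls isn't described here.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while batch := list(islice(iterator, batch_size)):
        yield batch


# 250 input rows with a batch size of 100 produce three batches.
sizes = [len(batch) for batch in iter_batches(range(250), 100)]
```

With these numbers, `sizes` is `[100, 100, 50]`: an error in one batch affects at most 100 rows, at the cost of three calls instead of one.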
Lineage builder CSV file
The Lineage Builder CSV file defines lineage relationships between source and target assets. Each row represents a single lineage process to be created or updated.
This app won't create any connections. The connections referenced in the CSV file must already exist in Atlan before running the workflow.
Required fields
- Source type: The type of the source (input, upstream) asset in the lineage. Must match a valid asset type in Atlan, such as Table, View, MaterialisedView, Column, or S3Object.
  Example: Table (specifies the source asset is a table)
- Source connector: The type of connector for the source asset, indicating the kind of source system. Must match a valid connector type in Atlan, such as snowflake, mssql, postgresql, or s3.
  Example: mssql (indicates the source is from Microsoft SQL Server)
- Source connection: The name of the connection in Atlan that contains the source asset. Must match an existing connection name in Atlan. If the asset doesn't exist and creation is permitted, this is the connection in which the source asset is created.
  Example: app-raw (the connection name in Atlan)
- Source Identity: The unique identity of the source asset in Atlan within the connection. Represents the asset's qualified name path, excluding the connection portion. The format typically follows a hierarchical structure such as database/schema/table or database/schema/table/column.
  Example: db1/schema1/source_table (the path to the source asset within the connection)
- Source name: The simple display name of the source asset. Represents the name that appears in the Atlan interface.
  Example: source_table (the display name of the source asset)
- Target type: The type of the target (output, downstream) asset in the lineage. Must match a valid asset type in Atlan, such as Table, View, MaterialisedView, Column, or S3Object.
  Example: View (specifies the target asset is a view)
- Target connector: The type of connector for the target asset, indicating the kind of target system. Must match a valid connector type in Atlan, such as snowflake, mssql, postgresql, or s3.
  Example: snowflake (indicates the target is in Snowflake)
- Target connection: The name of the connection in Atlan that contains the target asset. Must match an existing connection name in Atlan. If the asset doesn't exist and creation is permitted, this is the connection in which the target asset is created.
  Example: prod-analytics (the connection name in Atlan)
- Target Identity: The unique identity of the target asset in Atlan within the connection. Represents the asset's qualified name path, excluding the connection portion. The format typically follows a hierarchical structure such as database/schema/view or database/schema/table/column.
  Example: db2/schema2/target_view (the path to the target asset within the connection)
- Target name: The simple display name of the target asset. Represents the name that appears in the Atlan interface.
  Example: target_view (the display name of the target asset)
- Transformation connector: The type of connector for the transformation or data movement process itself, indicating the kind of system that performs the transformation. Must match a valid connector type in Atlan, such as dbt, mulesoft, airflow, or spark.
  Example: mulesoft (indicates the transformation is performed by MuleSoft)
- Transformation connection: The name of the connection in Atlan that contains the lineage process. Must match an existing connection name in Atlan. This is where the lineage process is created or updated.
  Example: prod-transformations (the connection name for transformation processes)
- Transformation Identity: The unique identity of the transformation (lineage) process in Atlan within the connection. Represents the process's qualified name path, excluding the connection portion. The format typically follows a hierarchical structure such as workflow/task/transformation or a unique identifier.
  Example: xform_123 (the unique identifier for the transformation process)
- Transformation name: The simple display name to use when displaying the lineage process in the Atlan interface. Represents the name that appears in lineage graphs and process listings.
  Example: source_table > target_view (a descriptive name for the transformation process)
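The following sketch assembles a one-row lineage file from the example values above using Python's standard `csv` module. It assumes the field labels listed in this section are the literal column headers, which you should confirm against the sample CSV file; `REQUIRED_FIELDS`, `example_row`, and `write_lineage_csv` are hypothetical names.

```python
import csv
import io

# Assumption: the CSV headers are spelled exactly like the field labels above.
REQUIRED_FIELDS = [
    "Source type", "Source connector", "Source connection",
    "Source Identity", "Source name",
    "Target type", "Target connector", "Target connection",
    "Target Identity", "Target name",
    "Transformation connector", "Transformation connection",
    "Transformation Identity", "Transformation name",
]

example_row = {
    "Source type": "Table", "Source connector": "mssql",
    "Source connection": "app-raw",
    "Source Identity": "db1/schema1/source_table",
    "Source name": "source_table",
    "Target type": "View", "Target connector": "snowflake",
    "Target connection": "prod-analytics",
    "Target Identity": "db2/schema2/target_view",
    "Target name": "target_view",
    "Transformation connector": "mulesoft",
    "Transformation connection": "prod-transformations",
    "Transformation Identity": "xform_123",
    "Transformation name": "source_table > target_view",
}


def write_lineage_csv(rows, separator=","):
    """Return CSV text for the rows, rejecting rows with empty required fields."""
    for number, row in enumerate(rows, start=1):
        missing = [field for field in REQUIRED_FIELDS if not row.get(field)]
        if missing:
            raise ValueError(f"row {number} is missing: {', '.join(missing)}")
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED_FIELDS, delimiter=separator)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = write_lineage_csv([example_row])
```

Checking for empty required fields before writing catches incomplete rows locally, before the file is uploaded or placed in object storage.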
Matching behavior
When importing, assets are matched with existing objects in Atlan using the following rules:
- Asset matching: Both the asset's qualified name (constructed from connector, connection, and identity) and type must match an existing asset.
- Case sensitivity: Matching can be performed case-sensitively (default) or case-insensitively, depending on the Case-sensitive match for assets option configuration.
CSV data must align with these rules to successfully match existing assets. If an asset doesn't match and creation is permitted (via the Unknown asset handling option), a new asset is created.
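The matching rules can be pictured as a lookup on a composite key. The sketch below is an assumption-laden illustration, not Atlan's matching logic: it represents an asset as a (type, connector, connection, identity) tuple instead of building a real qualified name, and it assumes that only the identity is compared case-insensitively when case-sensitive matching is turned off. `asset_key` and `find_match` are hypothetical names.

```python
def asset_key(asset_type, connector, connection, identity, case_sensitive=True):
    """Build a comparison key; the identity is lowercased for insensitive matching."""
    if not case_sensitive:
        identity = identity.lower()
    return (asset_type, connector, connection, identity)


def find_match(row_asset, existing_assets, case_sensitive=True):
    """Return the first existing asset whose type and name both match, else None."""
    wanted = asset_key(*row_asset, case_sensitive=case_sensitive)
    for candidate in existing_assets:
        if asset_key(*candidate, case_sensitive=case_sensitive) == wanted:
            return candidate
    return None


existing = [("Table", "snowflake", "prod-analytics", "db2/schema2/Orders")]
lower_case = ("Table", "snowflake", "prod-analytics", "db2/schema2/orders")
wrong_type = ("View", "snowflake", "prod-analytics", "db2/schema2/Orders")

exact_only = find_match(lower_case, existing, case_sensitive=True)
relaxed = find_match(lower_case, existing, case_sensitive=False)
type_mismatch = find_match(wrong_type, existing, case_sensitive=False)
```

In this sketch `exact_only` is `None`, `relaxed` finds the existing table, and `type_mismatch` is `None` because the type must match regardless of the case setting.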
Common fields
You can use these common fields on lineage processes to add metadata beyond the required fields.
- description: Optional description of the lineage process.
  Example: ETL process that transforms raw customer data into analytics-ready format
- userDescription: Optional user-provided description of the lineage process.
  Example: Custom transformation that aggregates daily sales data into monthly summaries
- sql: Optional SQL query giving details of how the lineage was derived.
  Example: CREATE VIEW target_view AS SELECT * from source_table WHERE some_column > 0;
For any attribute that supports multiple values, separate each value by a newline within the cell.
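A cell that contains newlines must be quoted so it stays a single field. As a minimal sketch, Python's standard `csv` module does this automatically; the `ownerUsers` column and its values are assumptions chosen for illustration, so substitute a multi-valued property that applies to your assets.

```python
import csv
import io

owners = ["alice", "bob"]                        # hypothetical values
header = ["Transformation name", "ownerUsers"]   # "ownerUsers" is an assumed column
row = ["source_table > target_view", "\n".join(owners)]

# newline="" stops Python from translating the line endings the csv module writes.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(row)
csv_text = buffer.getvalue()

# Reading the text back shows the cell is still one field holding two values.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
round_tripped = parsed[1][1].split("\n")
```

The writer wraps the multi-value cell in double quotes, so the file still has exactly one header row and one data row even though the text spans three physical lines.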
You can also supply any number of additional columns beyond the required ones. Column names must match one of the lineage process properties or more general asset properties.
Sample CSV file
Download a sample CSV file to understand the required structure: Download sample lineage CSV
This sample file shows the structure and format only. It may not import as-is and is merely a template for creating your own CSV files.