Lineage and asset loader
App

The Lineage and asset loader app creates lineage relationships between assets (and optionally creates the assets themselves) based on source-to-target mappings defined in a CSV file stored in Amazon S3. It supports both relational database and S3 object asset types, making it useful for tools that Atlan doesn't natively connect to or where lineage can't be extracted automatically. This reference provides complete configuration details for the Lineage and asset loader app.

Configuration

This section defines the fields required for workflow setup.

Workflow name

Specifies a unique and descriptive name to identify this workflow configuration in the Atlan interface. This name appears in the workflow list and helps distinguish it from other lineage workflows.

Example:

prod-etl-lineage-loader

Input

Specifies the method by which the app accesses the input CSV mapping file. S3 Bucket is the only supported option.

Authentication

Selects the AWS authentication method used to access the S3 bucket containing the mapping file.

IAM user authentication uses an IAM user's access key ID and secret access key. Use this method when you manage a dedicated IAM user with an attached policy that grants access to the S3 bucket.

  • AWS Access Key: Access key ID for the IAM user with permission to read from the S3 bucket.
  • AWS Access Secret: Secret access key paired with the access key ID in the preceding field.

To set up IAM user authentication:

  1. Create an IAM user in the AWS Identity and Access Management console.
  2. On the Set permissions page, attach your S3 access policy to the user.
  3. After the user is created, copy the access key ID and secret access key for use in the workflow.
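
The S3 access policy mentioned in step 2 isn't spelled out in this reference. As a minimal sketch, the following Python builds a read-only IAM policy document for a single mapping file. It assumes that `s3:GetObject` on the file's key is sufficient; the app may need further permissions (for example `s3:ListBucket`), so treat this as a starting point. The bucket name `example-lineage-bucket` and the helper `build_read_policy` are hypothetical.

```python
import json


def build_read_policy(bucket: str, key: str) -> dict:
    """Return an IAM policy document that allows reading one S3 object.

    Assumption: read access to the mapping file alone is enough for the app.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadLineageMappingFile",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # ARN format for an object: arn:aws:s3:::<bucket>/<key>
                "Resource": [f"arn:aws:s3:::{bucket}/{key}"],
            }
        ],
    }


# Hypothetical bucket name; the key reuses the example from this reference.
policy = build_read_policy(
    "example-lineage-bucket", "lineage/mappings/etl-to-warehouse.csv"
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the policy editor when you create the policy to attach to the IAM user.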

S3 bucket settings

Defines the location of the mapping file in Amazon S3.

  • S3 Bucket Name: Name of the S3 bucket where the input CSV file is stored.

  • Mapping filename / key: Full key (path and filename) of the CSV mapping file within the bucket, including any prefix.

    Example: lineage/mappings/etl-to-warehouse.csv

  • S3 Region: AWS region where the S3 bucket is located.

    Example: us-east-1

Connection settings

Defines how the workflow tracks the assets and lineage it creates.

Connection qualified name

Specifies the qualified name of the connection where the workflow creates lineage process assets. This connection must already exist in Atlan before you run the workflow; the workflow can't create it.

Example:

default/custom-etl/1234567890

Name

Specifies the name of the custom metadata set the workflow uses to tag assets it creates or manages. If this custom metadata set doesn't already exist, the workflow creates it and locks it in the UI to prevent accidental modification.

Example:

ETL Lineage Loader

Instance name

Specifies the name of the custom metadata property within the set identified in Name that stores the workflow instance identity on each managed asset.

Example:

Loader Instance

Instance unique ID

Assigns a unique identifier stored on every asset and lineage process created by this workflow. On subsequent runs, the workflow uses this ID to locate the assets it created so that it can update their metadata or deprecate records that are no longer present in the mapping file.

Must be unique per workflow

Each workflow configuration must use a distinct Instance unique ID. Reusing an ID across configurations causes the workflow to incorrectly manage assets from other runs.

Example:

prod-etl-loader-v1
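
If you manage several loader workflows, a quick check for reused IDs can catch this problem before a run. The sketch below is illustrative only: the configuration dictionaries and their keys (`workflow_name`, `instance_unique_id`) are hypothetical stand-ins for however you record your workflow settings, not an Atlan export format.

```python
from collections import defaultdict


def find_reused_instance_ids(configs):
    """Return {instance_id: [workflow names]} for IDs used by more than one workflow."""
    names_by_id = defaultdict(list)
    for cfg in configs:
        names_by_id[cfg["instance_unique_id"]].append(cfg["workflow_name"])
    return {iid: names for iid, names in names_by_id.items() if len(names) > 1}


# Hypothetical inventory of workflow configurations.
configs = [
    {"workflow_name": "prod-etl-lineage-loader", "instance_unique_id": "prod-etl-loader-v1"},
    {"workflow_name": "staging-etl-lineage-loader", "instance_unique_id": "staging-etl-loader-v1"},
    {"workflow_name": "adhoc-lineage-loader", "instance_unique_id": "prod-etl-loader-v1"},
]

reused = find_reused_instance_ids(configs)
for instance_id, names in reused.items():
    print(f"{instance_id} is shared by: {', '.join(names)}")
```

An empty result means every configuration in the inventory uses a distinct Instance unique ID.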

Lineage and asset loader CSV file

The input CSV file defines the source-to-target mappings used to create lineage and assets. Each row represents one lineage relationship. Fields fall into four groups: source identifiers, target identifiers, asset creation controllers, and lineage metadata.

Each file supports only one source type and one target type.
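
Because a file that mixes types isn't supported, it can help to check a mapping file before uploading it to S3. The following is a minimal sketch using Python's standard `csv` module. It only checks the four columns required in every row (`SOURCE_TYPE`, `SOURCE_CONN`, `TARGET_TYPE`, `TARGET_CONN`) and the one-type-per-file rule; the function name `check_mapping_types` and the connection name `default/s3/1234567890` are hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = ("SOURCE_TYPE", "SOURCE_CONN", "TARGET_TYPE", "TARGET_CONN")


def check_mapping_types(csv_text):
    """Return a list of problems found in the type columns of a mapping file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    source_types, target_types = set(), set()
    for row in reader:
        source_types.add(row["SOURCE_TYPE"])
        target_types.add(row["TARGET_TYPE"])

    problems = []
    if len(source_types) > 1:
        problems.append(f"multiple source types: {sorted(source_types)}")
    if len(target_types) > 1:
        problems.append(f"multiple target types: {sorted(target_types)}")
    return problems


# This sample mixes Table and S3 Object sources, so it should be flagged.
sample = (
    "SOURCE_TYPE,SOURCE_CONN,TARGET_TYPE,TARGET_CONN\n"
    "Table,default/snowflake/1234567890,Table,default/snowflake/1234567890\n"
    "S3 Object,default/s3/1234567890,Table,default/snowflake/1234567890\n"
)
problems = check_mapping_types(sample)
print(problems)
```

Splitting the flagged rows into a separate file, one per source and target type combination, resolves the problem.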

Source identifiers

Regardless of source type, the following two fields are required in every row:

Field | Description
SOURCE_TYPE | Asset type of the source. Use Table for relational database assets or S3 Object for S3 assets.
SOURCE_CONN | Qualified name of the connection where source assets reside or are created.

Additional fields depend on the source asset type:

Use the following fields when source assets are relational database tables.

Field | Description
SOURCE_DB | Name of the database containing the source table.
SOURCE_SCHEMA | Name of the schema containing the source table.
SOURCE_TABLE | Name of the source table.

Example row (database source):

SOURCE_TYPE,SOURCE_CONN,SOURCE_DB,SOURCE_SCHEMA,SOURCE_TABLE
Table,default/snowflake/1234567890,ANALYTICS,PUBLIC,RAW_ORDERS

Target identifiers

Regardless of target type, the following two fields are required in every row:

Field | Description
TARGET_TYPE | Asset type of the target. Use Table for relational database assets or S3 Object for S3 assets.
TARGET_CONN | Qualified name of the connection where target assets reside or are created.

Additional fields depend on the target asset type:

Use the following fields when target assets are relational database tables.

Field | Description
TARGET_DB | Name of the database containing the target table.
TARGET_SCHEMA | Name of the schema containing the target table.
TARGET_TABLE | Name of the target table.

Asset creation controllers

Controls whether the workflow creates source or target assets when they don't already exist in Atlan. Lineage is generated only for rows where both the source and target assets exist in Atlan, either because they already existed or because the workflow created them in the same run.

Field | Values | Description
CREATE_SOURCE_IF_NOT_EXISTS | TRUE / FALSE | When TRUE, the workflow creates the source asset if it doesn't exist. When FALSE, the source asset must already exist in Atlan for lineage to be generated on that row.
CREATE_TARGET_IF_NOT_EXISTS | TRUE / FALSE | When TRUE, the workflow creates the target asset if it doesn't exist. When FALSE, the target asset must already exist in Atlan for lineage to be generated on that row.
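
The rule above can be expressed as a small decision function. This is a sketch of the documented behavior, not the app's implementation. It assumes that flag values are compared case-insensitively and that a missing flag behaves like FALSE; neither detail is stated in this reference.

```python
def parse_flag(value):
    """Interpret a TRUE / FALSE cell. Assumption: case-insensitive, whitespace ignored."""
    return value.strip().upper() == "TRUE"


def lineage_is_generated(source_exists, target_exists, row):
    """Predict whether a row yields lineage, given which assets already exist in Atlan."""
    # An asset is available if it already exists or the row allows creating it.
    source_ok = source_exists or parse_flag(row.get("CREATE_SOURCE_IF_NOT_EXISTS", "FALSE"))
    target_ok = target_exists or parse_flag(row.get("CREATE_TARGET_IF_NOT_EXISTS", "FALSE"))
    return source_ok and target_ok


row = {"CREATE_SOURCE_IF_NOT_EXISTS": "TRUE", "CREATE_TARGET_IF_NOT_EXISTS": "FALSE"}

# Source is missing but may be created; target already exists.
print(lineage_is_generated(source_exists=False, target_exists=True, row=row))

# Target is missing and may not be created, so no lineage for this row.
print(lineage_is_generated(source_exists=False, target_exists=False, row=row))
```

The first call prints True and the second prints False, matching the table: a missing asset blocks lineage unless its creation flag is TRUE.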

Lineage metadata fields

Attaches descriptive metadata to the lineage process asset connecting each source-to-target pair.

Field | Description
DESCRIPTION | Human-readable description saved on the lineage process asset. Updates on subsequent runs if the value changes in the mapping file.
EXPRESSION | SQL statement or expression saved on the lineage process asset. Updates on subsequent runs if the value changes in the mapping file.

Both fields are optional. Rows without these values create lineage with no description or expression on the process asset.
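
To show how the four field groups fit together, here is a minimal sketch that writes a complete table-to-table mapping file with Python's `csv` module, which also handles quoting for expressions that contain commas. The column order, the target table name `CLEAN_ORDERS`, and the sample description and expression are assumptions for illustration; this reference doesn't state whether column order matters.

```python
import csv
import io

# Assumed column order: source, target, creation controllers, lineage metadata.
COLUMNS = [
    "SOURCE_TYPE", "SOURCE_CONN", "SOURCE_DB", "SOURCE_SCHEMA", "SOURCE_TABLE",
    "TARGET_TYPE", "TARGET_CONN", "TARGET_DB", "TARGET_SCHEMA", "TARGET_TABLE",
    "CREATE_SOURCE_IF_NOT_EXISTS", "CREATE_TARGET_IF_NOT_EXISTS",
    "DESCRIPTION", "EXPRESSION",
]


def build_mapping_csv(rows):
    """Serialize mapping rows to CSV text; omitted optional fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "SOURCE_TYPE": "Table",
        "SOURCE_CONN": "default/snowflake/1234567890",
        "SOURCE_DB": "ANALYTICS",
        "SOURCE_SCHEMA": "PUBLIC",
        "SOURCE_TABLE": "RAW_ORDERS",
        "TARGET_TYPE": "Table",
        "TARGET_CONN": "default/snowflake/1234567890",
        "TARGET_DB": "ANALYTICS",
        "TARGET_SCHEMA": "PUBLIC",
        "TARGET_TABLE": "CLEAN_ORDERS",  # hypothetical target table
        "CREATE_SOURCE_IF_NOT_EXISTS": "FALSE",
        "CREATE_TARGET_IF_NOT_EXISTS": "TRUE",
        "DESCRIPTION": "Nightly cleanup of raw orders",
        "EXPRESSION": "INSERT INTO CLEAN_ORDERS SELECT order_id, amount FROM RAW_ORDERS",
    }
]

mapping_csv = build_mapping_csv(rows)
print(mapping_csv)
```

Leaving `DESCRIPTION` and `EXPRESSION` out of a row dictionary produces empty cells for those columns, which corresponds to lineage with no description or expression on the process asset.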

See also

  • Load lineage and assets from CSV: Step-by-step guide for preparing the mapping file, configuring S3 access, and running the workflow.
  • Lineage Builder: Reference for creating lineage from CSV uploaded directly or via object storage.
  • Alert propagation: Reference for propagating alert metadata through lineage to downstream assets.