Lineage generator (no transformations) App
The Lineage Generator (No Transformations) app automatically detects matching or similar assets between two connections in Atlan and generates lineage links between them. This enables teams to establish upstream and downstream data flows without requiring transformation logic, making lineage creation faster and more consistent across your data estate.
The app provides three capabilities:
- Preview lineage: Validate matches before creating them.
- Generate lineage: Bulk-create relationships between matched assets.
- Delete lineage: Clean up or roll back lineage previously created by the app.
It works across multiple asset types, including tables, views, files, BI assets, and more. You can apply custom matching rules using case sensitivity, schema matching, and regex-based name replacements to ensure accurate lineage generation. This reference provides complete configuration details to help you set up and customize lineage generation.
Access
The Lineage Generator (No Transformations) app isn't enabled by default. To use this app, contact Atlan support and request it be added to your tenant.
Configurations
This section explains the properties you can configure in the Lineage Generator (no transformations) app.
Workflow name
Specifies the display name for the workflow in Atlan. This name is used to identify the workflow in the UI and logs.
Choose a name that clearly reflects the purpose or scope of the lineage generation run.
Example:
Sales Data Lineage Generation
Source asset type
Specifies the type of the input assets from which lineage must originate. This determines which assets in Atlan are scanned and matched with the Target asset type to generate lineage.
Supported types include:
- Relational: Table, View, Materialized View, Calculation View, Column
- Other sources: MongoDB Collection, Salesforce Object, Salesforce Field, S3 / ADLS / GCS Object, Power BI Table, Power BI Column, Kafka Topic, Looker Field, Looker View
For detailed parsing rules and examples for each supported type, see the Source asset type reference.
When Match on schema is set to Yes, schema names are included in the matching logic. This means that both schema and asset names must align for lineage to be created.
Schema matching applies only when both the source and target asset types are relational (Table, View, Materialized View, Calculation View, Column) or a MongoDB Collection. If either type is outside these categories, schema matching is ignored.
Example:
- With Match on schema = No, sales.orders and marketing.orders both map to a downstream target named orders.
- With Match on schema = Yes, only sales.orders connects to sales.orders, while marketing.orders is treated as separate.
Source qualified name prefix
Specifies the qualified name prefix for the source assets. The qualified name is the unique identifier Atlan assigns to every asset, and the prefix is the first part of this identifier that indicates the connection, environment, and often the schema or directory where the asset resides.
When this property is set, the app only considers source assets whose qualified names begin with the specified prefix. This helps narrow the search scope to a specific connection or subset of assets, improving both accuracy and performance.
You can find this prefix in the qualifiedName field of an asset in Atlan. Open the asset's details, locate its qualifiedName, and copy the portion that represents the schema, folder, or relevant path to use as the prefix. Make sure the prefix matches the exact format of the asset's qualified name in Atlan. Incorrect or incomplete prefixes can result in no matches.
Example:
Snowflake schema: Use a schema-level prefix when you want to generate or validate lineage only for assets inside a specific Snowflake schema.
default/snowflake/1678901234/warehouse_name/database_name/schema_name
S3 bucket folder: Use a folder-level prefix when you want to restrict lineage operations to objects within a particular folder of an S3 bucket.
default/s3/1678904567/bucket-name/folder-name/.../object-name
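The prefix filter described in this section can be pictured as a plain starts-with check on each qualified name. The following Python sketch is illustrative only and isn't the app's implementation; the function name `filter_by_prefix` and the sample qualified names are hypothetical.

```python
def filter_by_prefix(qualified_names, prefix):
    """Keep only the qualified names that begin with the given prefix."""
    return [qn for qn in qualified_names if qn.startswith(prefix)]


# Hypothetical qualified names, shaped like the examples in this section.
assets = [
    "default/snowflake/1678901234/warehouse_name/database_name/sales/orders",
    "default/snowflake/1678901234/warehouse_name/database_name/marketing/orders",
    "default/s3/1678904567/bucket-name/folder-name/data.csv",
]

sales_prefix = "default/snowflake/1678901234/warehouse_name/database_name/sales"
print(filter_by_prefix(assets, sales_prefix))
```

In this sketch, a prefix ending in sales also matches a sibling schema such as sales_archive, because the check is purely textual. Whether the app behaves the same way isn't stated here, so use Preview Lineage to confirm the scope a prefix actually selects.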
Target asset type
Specifies the type of assets in the destination connection for which lineage is created. The system searches for assets of this type in the target connection and matches them with source assets according to the configured rules.
The target asset type usually represents the downstream asset in the lineage relationship. For example, when generating lineage between a staging schema in a data warehouse and a BI tool, the target asset type might be Table for the warehouse layer or Dashboard for the BI layer.
If either the Source asset type or Target asset type isn't a relational type (Table, View, Materialized View, Calculation View, or Column) or a MongoDB Collection, the Match on schema option is ignored.
Example: Same technology lineage
Table as the Source asset type and Table as the Target asset type in the same Snowflake environment, mapping tables from a staging schema to a reporting schema.
Example: Cross-technology lineage
Table as the Source asset type in Snowflake and Dataset as the Target asset type in Looker, mapping data warehouse tables to the Looker datasets that use them.
Target qualified name prefix
Specifies the qualified name prefix used to identify target assets for lineage generation. Only assets whose qualified names begin with this prefix are considered when matching them to source assets.
You can find the qualified name prefix in the qualifiedName field of an asset in Atlan. Open the asset’s details, copy its qualifiedName, and use the portion that represents the schema, folder, or relevant path as the prefix.
Example:
If the target assets are in a Snowflake schema named analytics, the qualified name prefix might look like:
default/snowflake/1234567890/warehouse_name/database_name/analytics
If the target assets are files in an S3 bucket, the prefix might be:
default/s3/1234567890/bucket-name/reports/.../object-name
Case sensitive match
Determines whether asset name matching between the source and target is performed with case sensitivity.
- Yes: Only matches assets when the names have exactly the same letter case. For example, Orders matches only Orders and not orders.
- No (default): Matches assets regardless of letter case. For example, Orders matches both Orders and orders.
This setting is useful when working with systems or naming conventions where case differences indicate different assets, or when aligning with case-sensitive data source rules.
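The two modes can be sketched as a single comparison function. This is a minimal illustration in Python, not the app's code; `names_match` is a hypothetical helper, and using `casefold()` for the case-insensitive comparison is an assumption.

```python
def names_match(source_name, target_name, case_sensitive=False):
    """Compare two asset names, optionally ignoring letter case."""
    if case_sensitive:
        return source_name == target_name
    # casefold() is a stricter form of lower() intended for caseless matching.
    return source_name.casefold() == target_name.casefold()


print(names_match("Orders", "orders"))                       # default: case is ignored
print(names_match("Orders", "orders", case_sensitive=True))  # exact case required
```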
Ignore circular lineage
Specifies whether lineage generation must skip relationships that might create a circular reference between assets.
- Yes: Prevents the creation of lineage where the source and target assets are connected in a loop, avoiding redundant or misleading paths in lineage graphs.
- No (default): Enables lineage to be created even if it results in a circular path.
Example: If you generate lineage between staging and reporting tables in Snowflake, you normally expect data to flow one way (staging → reporting). If reporting tables also reference staging tables, a circular path can appear.
When enabled: The lineage is created only in the forward direction, avoiding the circular loop:
Table A (staging) → Table B (reporting)
When not enabled: The lineage includes both directions, creating a loop:
Table A (staging) → Table B (reporting) → Table A (staging)
This option blocks self-referencing lineage paths when the same asset is included in both source and target scopes. To establish valid bidirectional lineage between different assets, such as staging and reporting layers, configure and run the workflow separately for each direction.
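One way to picture this option is as a filter over candidate source-to-target pairs. The Python sketch below is an assumption about the general idea, not the app's algorithm: the hypothetical helper `drop_circular` skips a pair when the source and target are the same asset, or when the reverse pair has already been kept. It doesn't detect longer loops.

```python
def drop_circular(pairs):
    """Return pairs in order, skipping self-references and direct reversals."""
    kept = []
    seen = set()
    for source, target in pairs:
        # Skip A -> A, and skip B -> A when A -> B was already kept.
        if source == target or (target, source) in seen:
            continue
        seen.add((source, target))
        kept.append((source, target))
    return kept


candidates = [
    ("staging.table_a", "reporting.table_b"),
    ("reporting.table_b", "staging.table_a"),  # would close the loop
    ("staging.table_c", "staging.table_c"),    # self-reference
]
print(drop_circular(candidates))
```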
Match on schema
Controls whether lineage matching compares both the schema name and the asset name, or only the asset name, when connecting source and target assets.
Schema matching applies only when both the Source asset type and Target asset type are relational objects (Table, View, Materialized View, Calculation View, or Column) or MongoDB Collections. If either asset type falls outside these categories, schema names aren't considered in the matching process.
- Yes: The schema name is included in the matching logic. A source and target asset are considered a match only if both the schema and the asset name align. Useful when the same asset names exist across multiple schemas. Prevents incorrect matches across different domains or functional areas.
Example: If you're working in a multi-schema warehouse where sales.orders and marketing.orders both exist, enabling this option makes sure that only sales.orders maps to sales.orders.
- No (default): Only the asset name is compared between source and target. Schemas are ignored during matching. Useful when assets with the same name are always intended to be connected, regardless of their schema.
Example: If you're consolidating data and want both sales.orders and marketing.orders to connect to a single downstream target orders_combined, select No so that schema differences don't prevent the match.
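The difference between the two settings comes down to which key is compared. The following Python sketch illustrates that idea under simplifying assumptions: assets are modeled as (schema, name) tuples, case handling and other transformations are left out, and `match_assets` is a hypothetical function rather than part of the app.

```python
from collections import defaultdict


def match_assets(sources, targets, match_on_schema=False):
    """Pair each (schema, name) source with the targets that share its key."""

    def key(asset):
        schema, name = asset
        # With schema matching on, both parts must align; otherwise only the name.
        return (schema, name) if match_on_schema else name

    targets_by_key = defaultdict(list)
    for target in targets:
        targets_by_key[key(target)].append(target)
    return [(s, t) for s in sources for t in targets_by_key.get(key(s), [])]


sources = [("sales", "orders"), ("marketing", "orders")]
targets = [("sales", "orders")]

print(match_assets(sources, targets))                        # name-only comparison
print(match_assets(sources, targets, match_on_schema=True))  # schema + name comparison
```

With the name-only key, both source tables pair with the target. With the schema-aware key, only the sales table does.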
Output type
Defines the type of action the app performs after matching source and target assets. You can select one of the following actions:
- Preview Lineage (default): Generates a CSV preview of the potential lineage mappings without applying changes in Atlan. Use this option when validating the configuration before making permanent updates.
Example: If you configure the app to match orders tables across sales and marketing schemas, the preview generates a file showing which assets might be connected, but no lineage is created in Atlan.
- Generate Lineage: Creates lineage relationships directly in Atlan between the matched source and target assets.
Example: If the preview showed sales.orders → warehouse.orders_combined and the configuration is approved, running with Generate Lineage adds this lineage in Atlan so it appears in the lineage graph.
Use this option when confident the mappings are correct and ready to be reflected in the workspace.
- Delete Lineage: Removes lineage previously created by this app from Atlan. Only relationships generated by the Lineage Generator (no transformations) app are deleted.
Example: If sales.orders was earlier mapped to warehouse.orders_combined but the connection is no longer valid, selecting Delete Lineage removes this relationship from Atlan.
Generate lineage on child assets
Specifies whether lineage is also created between the child assets of the selected source and target asset types.
This option is relevant only when both the Source asset type and Target asset type include child assets, such as tables with columns or Power BI tables with columns. For asset types that don't have a hierarchical structure, the option is ignored.
- Yes: Lineage is created at both the parent level (for example, table-to-table) and the child level (for example, column-to-column).
Example: If sales.orders (table) is mapped to warehouse.orders_combined (table), enabling this option also generates lineage for the columns under those tables, such as orders.customer_id → orders_combined.customer_id.
Useful when column-level lineage is required to understand transformations or dependencies in greater detail.
- No (default): Lineage is generated only at the parent level (for example, table-to-table), and no column-level (or child-level) mappings are created.
Example: With the same configuration, lineage connects only the sales.orders table to the warehouse.orders_combined table, without adding relationships between individual columns.
Suitable when high-level lineage is sufficient or when child asset mappings may add unnecessary noise.
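Child-level lineage can be thought of as a second matching pass over the children of each matched parent pair. The Python sketch below assumes that children are paired by name, ignoring case; this section doesn't specify how the app pairs children, so treat the rule and the helper `child_pairs` as hypothetical.

```python
def child_pairs(source_children, target_children):
    """Pair child names (for example, columns) of two matched parents by name."""
    targets = {name.casefold(): name for name in target_children}
    return [
        (name, targets[name.casefold()])
        for name in source_children
        if name.casefold() in targets
    ]


# Hypothetical column lists for sales.orders and warehouse.orders_combined.
orders_columns = ["customer_id", "order_date", "internal_flag"]
orders_combined_columns = ["CUSTOMER_ID", "order_date", "region"]

print(child_pairs(orders_columns, orders_combined_columns))
```

Columns without a counterpart on the other side (internal_flag and region in this sample) produce no child-level pair.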
Name transformation (regex)
If you want to replace specific characters or patterns in asset names so they align across systems, configure the regex fields together. When Match on schema = Yes, use the schema-specific regex fields to handle schema-level differences. The name-only regex fields apply only to object names, such as tables or columns, and aren't used for schema comparisons.
- Regex to match characters to replace: Defines the regex patterns you want to match in the source asset names.
- Regex with replacement characters: Defines what those matched patterns must be replaced with.
Example: If you want to change all _id suffixes into _identifier to match your target system:
- In Regex to match characters to replace, enter: _id$
- In Regex with replacement characters, enter: _identifier
Example: To replace the temporary prefix tmp_ with a standard prefix stg_:
- In Regex to match characters to replace, enter: ^tmp_
- In Regex with replacement characters, enter: stg_
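To check a pattern and replacement pair before entering it, you can try it locally. The sketch below uses Python's `re.sub` as a stand-in; the regex flavor used by the app isn't stated here, so treat this as an approximation for simple patterns like the two above. The helper `transform_name` is hypothetical.

```python
import re


def transform_name(name, pattern, replacement):
    """Replace every match of pattern in an asset name."""
    return re.sub(pattern, replacement, name)


print(transform_name("customer_id", r"_id$", "_identifier"))
print(transform_name("tmp_orders", r"^tmp_", "stg_"))
print(transform_name("id_lookup", r"_id$", "_identifier"))  # no suffix match: unchanged
```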
Schema name transformation (regex)
Use these properties when schema names differ between environments or systems and need to be standardized for lineage generation.
- Regex to match characters to replace on the schema: Defines the pattern in the schema name to identify the part that must be replaced.
- Regex with replacement characters on the schema: Specifies the replacement text to use for each pattern defined earlier.
Both fields must be configured together: each regex pattern requires a corresponding replacement. If one is left blank, the transformation doesn't occur.
Example: If you want to make schema names lowercase across environments:
- In Regex to match characters to replace on the schema, enter: [A-Z]
- In Regex with replacement characters on the schema, enter: \L$0
This makes sure schemas like Finance, HR_DATA, and SALES all become lowercase (finance, hr_data, sales) for consistent matching.
Example: If you want to ignore version numbers in schema names:
- In Regex to match characters to replace on the schema, enter: _[0-9]+$
- In Regex with replacement characters on the schema, leave the field empty to remove the matched text.
This maps schemas like analytics_2023 and analytics_01 into a single logical schema: analytics.
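The effect of both schema examples can be reproduced locally with Python's `re` module, again as an approximation of whatever regex engine the app uses. Python's `re` doesn't support the \L case-conversion replacement, so this sketch lowercases with a replacement function instead. The helper names are hypothetical.

```python
import re


def lowercase_schema(schema):
    """Lowercase every uppercase ASCII letter matched by [A-Z]."""
    return re.sub(r"[A-Z]", lambda m: m.group(0).lower(), schema)


def strip_version(schema):
    """Remove a trailing _<digits> version suffix, if present."""
    return re.sub(r"_[0-9]+$", "", schema)


for schema in ["Finance", "HR_DATA", "SALES"]:
    print(schema, "->", lowercase_schema(schema))

for schema in ["analytics_2023", "analytics_01", "analytics"]:
    print(schema, "->", strip_version(schema))
```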
Using name + schema regex properties
If you want to standardize both schema and object names together, configure the following properties:
- Regex to match characters to replace on the name + schema: Defines the pattern across the full schema.object identifier to identify the part that must be replaced.
- Regex with replacement characters on the name + schema: Specifies the replacement text to use for each pattern defined earlier.
These properties apply only when Match on schema = Yes, because the system uses the combined schema.object key instead of handling the schema and object name separately. When configured, the name + schema regex pair takes precedence over name-only or schema-only regex pairs.
Example: If you want to align schema and table names together, you can configure both the schema and object name in a single regex pair. This is useful when naming conventions include environment-specific prefixes or temporary suffixes that need to be normalized across systems.
- In Regex to match characters to replace on the name + schema, enter: ^(raw_|stg_|prod_)|(_tmp$)
- In Regex with replacement characters on the name + schema, enter: ""
With this configuration, common environment prefixes (raw_, stg_, prod_) and temporary suffixes (_tmp) are removed:
- raw_sales.orders_tmp → sales.orders
- prod_sales.orders → sales.orders
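Because the pattern runs against the combined schema.object key, its anchors refer to the start and end of the whole key. The Python sketch below applies the pattern from this example with `re.sub`, assuming the key is the schema and object name joined by a dot; the helper `normalize_key` is hypothetical.

```python
import re

PATTERN = r"^(raw_|stg_|prod_)|(_tmp$)"


def normalize_key(schema, name):
    """Apply the name + schema pattern to the combined schema.object key."""
    return re.sub(PATTERN, "", f"{schema}.{name}")


print(normalize_key("raw_sales", "orders_tmp"))
print(normalize_key("prod_sales", "orders"))
print(normalize_key("sales", "stg_orders"))  # prefix sits mid-key, so ^ doesn't reach it
```

In this sketch, an environment prefix on the object name (as in the last call) is left in place, because ^ only anchors at the start of the combined key.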
Match prefix
Adds a fixed string to the beginning of each source asset name before comparing it to target assets.
This property is useful when target assets follow a consistent naming convention that uses a prefix (such as stg_, prod_, or team_), but source assets don't. For example, if the source asset is orders and the target asset is stg_orders, setting the match prefix to stg_ enables them to match.
- If both Match prefix and Match suffix are set, the prefix is applied first, followed by the suffix.
- If Match on schema = Yes, the prefix applies only to the object name, not the schema.
- If regex transformations on name + schema are configured, those transformations take precedence, and the prefix is ignored. If regex transformations on name only are configured, the regex runs first, and then the prefix is added.
- If the differences between source and target names go beyond a simple prefix, regex transformations are the preferred option.
Example: To match a source asset named orders with a target asset named stg_orders, set Match prefix to stg_. This adds the prefix to the source name so both align.
Match suffix
Appends a fixed string to the end of each source asset name before comparing it to target assets.
This property is useful when target systems consistently add a suffix (such as _prod, _stg, or _2024) to asset names, but source systems don't. For example, if the source asset is orders and the target asset is orders_prod, adding the suffix _prod enables them to match.
- If both Match prefix and Match suffix are set, the prefix is applied first, followed by the suffix.
- If Match on schema = Yes, the suffix applies only to the object name, not the schema.
- If regex transformations on name + schema are configured, those transformations take precedence, and the suffix is ignored.
- If regex transformations on name only are configured, the regex runs first, then the suffix is added.
- If the naming differences are more complex than a consistent suffix, regex transformations are the preferred option.
Example: To match a source table named orders with a target table named orders_prod, set Match suffix to _prod. This transforms the source name to orders_prod before comparison, creating a correct match.
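The ordering rules for name-only regex, Match prefix, and Match suffix can be summarized in a few lines. This Python sketch follows the order described above (regex first, then prefix, then suffix). It is illustrative only: `comparison_name` is a hypothetical helper, and the name + schema regex case, where prefix and suffix are ignored, isn't modeled.

```python
import re


def comparison_name(name, prefix="", suffix="", pattern=None, replacement=""):
    """Build the source name used for comparison against target names."""
    if pattern is not None:
        # Name-only regex runs before the prefix and suffix are added.
        name = re.sub(pattern, replacement, name)
    return f"{prefix}{name}{suffix}"


print(comparison_name("orders", prefix="stg_"))
print(comparison_name("orders", suffix="_prod"))
print(comparison_name("tmp_orders", prefix="stg_", suffix="_prod", pattern=r"^tmp_"))
```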
File path segmentation (file-based assets only)
For file-based assets (such as S3, ADLS, or GCS objects), use these properties to extract the meaningful parts of a file path for lineage matching.
- File advanced separator: Defines the character used to split the path into segments (for example, / in an S3 path or \ in a Windows-style path).
- File advanced position: Specifies how many segments to keep from the end of the split path. The count is right-to-left (for example, 3 keeps the last three segments).
These properties are designed to work together:
- The separator defines where to cut the path.
- The position defines which slice of the path to use for matching.
- If only a separator is set, the path is split but the full string is still compared.
- If only a position is set, it has no effect because no split occurs.
- For best results, configure both properties together.
Other transformations (such as regex, prefix, or suffix) are applied after separator and position logic. If Match on schema = Yes, the folder path segments kept by this configuration are treated like a schema. This makes it possible to align files across environments in the same way schemas align tables in databases.
Example: If your S3 folder paths include the environment (such as staging or prod), you can use the file segmentation properties to focus only on the meaningful parts of the path.
- In File advanced separator, enter: /
- In File advanced position, enter: 2
This keeps the last two segments of the path for matching. For example:
arn:aws:s3:::mybucket/staging/customers/data.csv
becomes:
customers/data.csv
so it can correctly align with:
arn:aws:s3:::mybucket/prod/customers/data.csv
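The split-and-keep behavior can be sketched as follows. This Python example is an approximation of the rules listed above, not the app's code; `keep_last_segments` is a hypothetical helper. It returns the full path unless both a separator and a position are provided, mirroring the rule that the two properties only take effect together.

```python
def keep_last_segments(path, separator=None, position=None):
    """Split path on separator and keep the last `position` segments."""
    if not separator or not position:
        # With only one of the two properties set, the full path is compared.
        return path
    segments = path.split(separator)
    return separator.join(segments[-position:])


staging = "arn:aws:s3:::mybucket/staging/customers/data.csv"
prod = "arn:aws:s3:::mybucket/prod/customers/data.csv"

print(keep_last_segments(staging, "/", 2))
print(keep_last_segments(prod, "/", 2))
print(keep_last_segments(staging, "/", 3))
```

Under a literal right-to-left count, keeping two segments of these sample paths drops the environment folder so both reduce to the same string, while keeping three retains it.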
Process connection
The Process connection property defines which connection the generated lineage processes belong to. If left blank, processes are automatically assigned to the same connection as the source assets.
This property is particularly useful when the lineage logically represents a transformation or movement between two systems and must live in a neutral or dedicated connection, rather than being tied only to the source.
Process connection works independently from regex, prefix/suffix, and file path segmentation properties. Those affect how matching is computed, while Process connection affects only process placement.
Example: To centralize lineage processes in a dedicated connection called lineage_sandbox, enter lineage_sandbox in Process connection.