Source asset type

The Source asset type property is part of the Lineage Generator (no transformations) app. It defines the type of asset from which lineage originates in Atlan. This reference explains what each supported source asset type represents, how the app parses its qualified names, and how parsing behavior changes when related properties such as Match on schema, Case sensitive match, File segmentation, or Regex transformations are enabled or disabled.

The same parsing rules apply to both Source asset type and Target asset type. This page focuses on source assets, but everything here also applies when you configure the Target asset type.

ADLS object

ADLS objects represent files stored in Azure Data Lake Storage. Qualified names include the connection, container, folder hierarchy, and file name.

  • Default parsing extracts only the file name.
  • File path segmentation can be applied using File advanced separator and File advanced position to retain folder context.
  • Match on schema isn't applied to file-based assets.
  • Regex transformations are applied after segmentation to remove suffixes or standardize naming.

Example: If you want to extract only the file name without any folder context, use the default configuration. This is useful when you need to match files across different folder structures:

default/adls/.../container/folder/data.csv → data.csv

Example: If you want to retain folder context for better organization, enable file segmentation. This helps when you need to distinguish between files in different environments or stages:

default/adls/.../container/env/staging/customers.csv → staging/customers.csv
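
The two ADLS examples above can be approximated in a few lines of Python. This is a minimal sketch, not the app's implementation: the function `parse_file_key` and its parameters `separator` and `keep_last` are hypothetical names, and it assumes that File advanced separator behaves like a split character and File advanced position like a count of trailing segments to keep.

```python
def parse_file_key(qualified_name: str, separator: str = "/", keep_last: int = 1) -> str:
    """Keep the last `keep_last` separator-delimited segments of a qualified name.

    Hypothetical helper: `separator` and `keep_last` stand in for the
    File advanced separator and File advanced position properties.
    """
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    segments = qualified_name.split(separator)
    return separator.join(segments[-keep_last:])


# Default configuration: only the file name survives.
default_key = parse_file_key("default/adls/.../container/folder/data.csv")
print(default_key)  # data.csv

# Segmentation enabled: keep one folder level in front of the file name.
segmented_key = parse_file_key(
    "default/adls/.../container/env/staging/customers.csv", keep_last=2
)
print(segmented_key)  # staging/customers.csv
```

Keeping trailing segments, rather than dropping a fixed number of leading ones, means the result doesn't depend on how deep the folder hierarchy is.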

Calculation view

Calculation views are relational assets used in systems like SAP HANA. These views combine multiple tables or other views using business logic, calculations, and aggregations to create a unified data model for reporting and analytics.

  • Default parsing strips connection, database, and schema, leaving only the view name.
  • If Match on schema = Yes, schema is included in the key for more precise lineage tracking.
  • Case sensitivity applies depending on system configuration and can affect matching across different environments.
  • Regex transformations can be used to standardize naming conventions or remove system-specific prefixes.

Example: If you want to match calculation views across different schemas with the same name, use the default configuration. This is useful when you have standardized view names across multiple schemas:

default/sap/.../DATABASE/SCHEMA/VIEW → VIEW

Example: If you need to distinguish between calculation views with the same name in different schemas, enable schema matching. This ensures proper lineage tracking when views have identical names but different contexts:

default/sap/.../DATABASE/SCHEMA/VIEW → SCHEMA/VIEW

Example: If you want to clean up SAP-specific naming conventions, apply regex transformations to standardize view names across your lineage:

SAP_CALC_VIEW_CUSTOMERS → CUSTOMERS
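
A regex transformation like the one above can be sketched with Python's standard `re` module. The helper name `apply_regex` and the pattern `^SAP_CALC_VIEW_` are illustrative assumptions. The app's own regex syntax and configuration fields may differ.

```python
import re


def apply_regex(key: str, pattern: str, replacement: str = "") -> str:
    """Replace every match of `pattern` in an already-parsed key (hypothetical helper)."""
    return re.sub(pattern, replacement, key)


# Strip an assumed system-specific prefix from the parsed view name.
cleaned = apply_regex("SAP_CALC_VIEW_CUSTOMERS", r"^SAP_CALC_VIEW_")
print(cleaned)  # CUSTOMERS
```

Anchoring the pattern with `^` limits the replacement to a leading prefix, so a view name that merely contains the same text elsewhere is left unchanged.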

Column

Columns represent individual fields inside tables or views. They're the fundamental building blocks of data structures and contain the actual data values. Column-level lineage is crucial for understanding how individual data elements flow through your data pipeline, from source systems to final analytics.

  • Default parsing includes table and column names, providing context about which table contains the column.
  • If Match on schema = Yes, schema is included for more granular lineage tracking across different database schemas.
  • Regex transformations can be used for column name cleanup, standardization, or alignment with business terminology.
  • Case sensitivity is important for systems where column names may have different casing conventions.

Example: If you want to match columns across different schemas with the same table and column names, use the default configuration. This is useful for cross-schema column lineage:

default/snowflake/.../DB/SCHEMA/TABLE/COLUMN → TABLE/COLUMN

Example: If you need to distinguish between columns with the same name in different schemas, enable schema matching. This ensures accurate lineage when columns exist in multiple schemas:

default/snowflake/.../DB/SCHEMA/TABLE/COLUMN → SCHEMA/TABLE/COLUMN

Example: If you want to standardize column naming conventions across different systems, apply regex transformations to align warehouse columns with source system fields:

CUSTOMER_ID_RAW → CUSTOMER_ID
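
The effect of Match on schema on column keys can be illustrated as follows. This is a sketch under the assumption that the key is built from the trailing segments of the qualified name. `parse_column_key` and `match_on_schema` are hypothetical names, not part of the app.

```python
def parse_column_key(qualified_name: str, match_on_schema: bool = False) -> str:
    """Return TABLE/COLUMN, or SCHEMA/TABLE/COLUMN when schema matching is on."""
    segments = qualified_name.split("/")
    keep = 3 if match_on_schema else 2  # schema adds one more trailing segment
    return "/".join(segments[-keep:])


qualified_name = "default/snowflake/.../DB/SCHEMA/TABLE/COLUMN"
print(parse_column_key(qualified_name))                        # TABLE/COLUMN
print(parse_column_key(qualified_name, match_on_schema=True))  # SCHEMA/TABLE/COLUMN
```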

DynamoDB table

DynamoDB tables are schemaless key-value store assets.

  • Parsing extracts the table name only.
  • Case sensitivity determines whether differently cased names are matched separately.

Example: If you want to extract just the table name for lineage matching, use the default configuration. This simplifies matching across different DynamoDB instances:

default/dynamodb/.../CUSTOMERS → CUSTOMERS
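
To make the case sensitivity rule concrete, the sketch below compares two parsed keys with and without case sensitivity. It assumes that a case-insensitive match is equivalent to comparing case-folded strings. The function names are hypothetical, and the app may normalize case differently.

```python
def normalize_key(key: str, case_sensitive: bool) -> str:
    """Leave the key untouched when case matters, otherwise fold its case."""
    return key if case_sensitive else key.casefold()


def keys_match(source_key: str, target_key: str, case_sensitive: bool = True) -> bool:
    """Compare a source key and a target key under the chosen case rule."""
    return normalize_key(source_key, case_sensitive) == normalize_key(
        target_key, case_sensitive
    )


print(keys_match("CUSTOMERS", "customers"))                        # False
print(keys_match("CUSTOMERS", "customers", case_sensitive=False))  # True
```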

GCS object

GCS objects represent files stored in Google Cloud Storage. Qualified names include the connection, bucket, folder hierarchy, and object.

  • Default parsing extracts only the object (file) name.
  • File path segmentation enables keeping multiple trailing parts of the path.
  • Regex transformations apply after segmentation.

Example: If you want to extract only the file name without folder context, use the default configuration. This is useful for matching files across different bucket structures:

default/gcs/.../bucket/folder/object.json → object.json

Example: If you need to retain folder context for better file organization, enable file segmentation. This helps when you want to distinguish between files in different environments or project folders:

default/gcs/.../bucket/env/prod/sales/data.json → prod/sales/data.json

Kafka topic

Kafka topics represent message streams. Qualified names include the connection and topic name.

  • Parsing extracts only the topic name.
  • Case sensitivity determines if topics with the same name but different case are considered matches.

Example: If you want to extract just the topic name for lineage matching, use the default configuration. This simplifies matching across different Kafka clusters:

default/kafka/.../topic/orders_events → orders_events

Looker field

Looker fields represent individual dimensions or measures in LookML models.

  • Parsing extracts only the field name portion.
  • Regex transformations can align field names with warehouse columns.

Example: If you want to extract only the field name for lineage matching, use the default configuration. This helps align Looker fields with their corresponding warehouse columns:

default/looker/.../distribution_centers.location → location
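
Unlike the relational examples, the Looker field example separates the view from the field with a dot rather than a slash. One way to reproduce that result, assuming the field name is whatever follows the last dot of the final path segment, is sketched below. `parse_looker_field` is a hypothetical name.

```python
def parse_looker_field(qualified_name: str) -> str:
    """Return the field name from a `.../view.field` style qualified name."""
    # Isolate the last path segment first so dots earlier in the path are ignored.
    last_segment = qualified_name.rsplit("/", 1)[-1]
    return last_segment.rsplit(".", 1)[-1]


print(parse_looker_field("default/looker/.../distribution_centers.location"))  # location
```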

Looker view

Looker views represent groupings of fields in LookML models.

  • Parsing extracts only the view name.
  • Regex transformations can clean prefixes or suffixes in view names.

Example: If you want to extract only the view name for lineage matching, use the default configuration. This simplifies matching Looker views with their underlying data sources:

default/looker/.../distribution_centers → distribution_centers

Materialized view

Materialized views are persisted relational query results.

  • Default parsing strips connection, database, and schema.
  • If Match on schema = Yes, schema is included.
  • Case sensitivity applies depending on system rules.

Example: If you want to match materialized views across different schemas with the same name, use the default configuration. This is useful when you have standardized view names across multiple schemas:

default/redshift/.../DB/SCHEMA/MV_NAME → MV_NAME

Example: If you need to distinguish between materialized views with the same name in different schemas, enable schema matching. This ensures proper lineage tracking when views exist in multiple schemas:

default/redshift/.../DB/SCHEMA/MV_NAME → SCHEMA/MV_NAME

MongoDB collection

MongoDB collections represent groups of documents within databases.

  • Default parsing extracts only the collection name.
  • If Match on schema = Yes, database and collection are included.
  • Case sensitivity can affect results when collection names overlap.

Example: If you want to match collections across different databases with the same name, use the default configuration. This is useful when you have standardized collection names across multiple databases:

default/mongodb/.../DB/COLLECTION → COLLECTION

Example: If you need to distinguish between collections with the same name in different databases, set Match on schema to Yes. For MongoDB, this adds the database to the key, which ensures accurate lineage when collections exist in multiple databases:

default/mongodb/.../DB/COLLECTION → DB/COLLECTION

Power BI column

Power BI columns represent fields within BI tables.

  • Default parsing extracts both table and column.
  • Regex transformations can rename BI fields to match warehouse columns.

Example: If you want to extract both table and column names for lineage matching, use the default configuration. This helps align Power BI fields with their corresponding warehouse columns:

default/powerbi/.../TABLE/COLUMN → TABLE/COLUMN

Power BI table

Power BI tables represent datasets inside BI models.

  • Default parsing extracts only the table portion.
  • Regex transformations can clean or rename tables to align with upstream assets.

Example: If you want to extract only the table name for lineage matching, use the default configuration. This simplifies matching Power BI tables with their underlying data sources:

default/powerbi/.../TABLE → TABLE

S3 object

S3 objects represent files stored in Amazon S3 buckets. Qualified names include the connection, bucket, folder hierarchy, and object.

  • Default parsing extracts only the file name.
  • File segmentation properties can keep multiple trailing segments.
  • Regex transformations are applied after segmentation to remove environment labels or extensions.
  • Match on schema isn't applied to file-based assets.

Example: If you want to extract only the file name without folder context, use the default configuration. This is useful for matching files across different bucket structures:

default/s3/.../mybucket/folder/data.csv → data.csv

Example: If you need to retain folder context for better file organization, enable file segmentation. This helps when you want to distinguish between files in different environments or project folders:

default/s3/.../mybucket/env/staging/customers/data.csv → staging/customers/data.csv

Example: If you want to remove environment labels after segmentation, apply regex transformations. This standardizes keys across environments while keeping the remaining folder context:

staging/customers/data.csv → customers/data.csv
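
The ordering described for S3 objects, segmentation first and regex second, can be sketched as a two-step pipeline. The function `parse_s3_key`, its parameters, and the pattern `^staging/` are assumptions made for illustration. They are not the app's actual settings.

```python
import re
from typing import Optional


def parse_s3_key(
    qualified_name: str,
    keep_last: int = 1,
    pattern: Optional[str] = None,
    replacement: str = "",
) -> str:
    """Segment the path first, then apply an optional regex to the result."""
    # Step 1: segmentation keeps the trailing `keep_last` path segments.
    segments = qualified_name.split("/")
    key = "/".join(segments[-keep_last:])
    # Step 2: the regex sees only the segmented key, never the full qualified name.
    if pattern is not None:
        key = re.sub(pattern, replacement, key)
    return key


qualified_name = "default/s3/.../mybucket/env/staging/customers/data.csv"
print(parse_s3_key(qualified_name))                                      # data.csv
print(parse_s3_key(qualified_name, keep_last=3))                         # staging/customers/data.csv
print(parse_s3_key(qualified_name, keep_last=3, pattern=r"^staging/"))   # customers/data.csv
```

Because the regex runs on the segmented key, a pattern anchored with `^` refers to the start of the retained segments, not the start of the full qualified name.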

Salesforce field

Salesforce fields represent attributes within Salesforce objects.

  • Parsing removes connection and org identifiers, leaving only object and field.
  • Regex transformations can normalize Salesforce field names to align with warehouse fields.

Example: If you want to extract object and field names for lineage matching, use the default configuration. This helps align Salesforce fields with their corresponding warehouse columns:

default/salesforce/.../ORG/ACCOUNT/ID → ACCOUNT/ID

Salesforce object

Salesforce objects represent entities such as Accounts or Contacts.

  • Parsing removes connection and org identifiers, leaving only the object.
  • Regex transformations can handle additional suffixes or prefixes across systems.

Example: If you want to extract only the object name for lineage matching, use the default configuration. This simplifies matching Salesforce objects with their corresponding warehouse tables:

default/salesforce/.../ORG/ACCOUNT → ACCOUNT

Table

Tables are standard relational assets in systems like Snowflake, BigQuery, and Redshift.

  • Default parsing removes connection, database, and schema.
  • If Match on schema = Yes, schema is included in the parsed key.
  • Case sensitivity determines whether same-named tables with different casing are treated separately.
  • Regex transformations can clean or align names.

Example: If you want to match tables across different schemas with the same name, use the default configuration. This is useful when you have standardized table names across multiple schemas:

default/snowflake/.../DB/SCHEMA/TABLE → TABLE

Example: If you need to distinguish between tables with the same name in different schemas, enable schema matching. This ensures proper lineage tracking when tables exist in multiple schemas:

default/snowflake/.../DB/SCHEMA/TABLE → SCHEMA/TABLE
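
The practical difference between the two table configurations shows up when two schemas contain a table with the same name. The sketch below groups qualified names by their parsed key. The schema names `SALES` and `FINANCE`, the table name `ORDERS`, and both helper functions are hypothetical and exist only to illustrate the collision.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def parse_table_key(qualified_name: str, match_on_schema: bool = False) -> str:
    """Return TABLE, or SCHEMA/TABLE when schema matching is on."""
    segments = qualified_name.split("/")
    keep = 2 if match_on_schema else 1
    return "/".join(segments[-keep:])


def group_by_key(
    qualified_names: Iterable[str], match_on_schema: bool = False
) -> Dict[str, List[str]]:
    """Group qualified names that parse to the same key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in qualified_names:
        groups[parse_table_key(name, match_on_schema)].append(name)
    return dict(groups)


tables = [
    "default/snowflake/.../DB/SALES/ORDERS",    # hypothetical schema and table
    "default/snowflake/.../DB/FINANCE/ORDERS",  # hypothetical schema and table
]

# Default: both tables share the key "ORDERS" and would be treated as the same asset.
print(sorted(group_by_key(tables)))  # ['ORDERS']

# Match on schema = Yes: each table gets its own key.
print(sorted(group_by_key(tables, match_on_schema=True)))  # ['FINANCE/ORDERS', 'SALES/ORDERS']
```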

View

Views are relational query-based objects.

  • Default parsing strips connection, database, and schema.
  • If Match on schema = Yes, schema is included.
  • Case sensitivity determines whether views with different casing match.

Example: If you want to match views across different schemas with the same name, use the default configuration. This is useful when you have standardized view names across multiple schemas:

default/bigquery/.../PROJECT/DB/SCHEMA/VIEW → VIEW

Example: If you need to distinguish between views with the same name in different schemas, enable schema matching. This ensures proper lineage tracking when views exist in multiple schemas:

default/bigquery/.../PROJECT/DB/SCHEMA/VIEW → SCHEMA/VIEW