What does Atlan crawl from Talend

Atlan crawls the following assets and metadata from Talend projects stored in GitHub or Atlassian Stash (Bitbucket Server).

Lineage

The Talend connector calculates lineage at both table level and column level.

Table-level lineage

Table-level lineage tracks data flow between:

  • Source databases → Talend jobs → Target databases
  • Source files → Talend jobs → Target databases
  • Source databases → Talend jobs → Target files

Example:

MySQL.sales.customers → [Talend Job: ETL_Customer_Data] → Snowflake.analytics.dim_customer
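
To make the notation above concrete, here is a minimal Python sketch that splits one such example line into its source, job, and target parts. The `TableLineage` class and `parse_table_lineage` function are hypothetical names introduced for illustration only; they are not part of Atlan or Talend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableLineage:
    """One table-level hop: source asset -> Talend job -> target asset (hypothetical type)."""
    source: str
    job: str
    target: str


def parse_table_lineage(line: str) -> TableLineage:
    # Split on the arrow used in the example notation.
    source, job, target = (part.strip() for part in line.split("→"))
    # Drop the "[Talend Job: ...]" wrapper and keep only the job name.
    job_name = job.strip("[]").split(":", 1)[1].strip()
    return TableLineage(source=source, job=job_name, target=target)


edge = parse_table_lineage(
    "MySQL.sales.customers → [Talend Job: ETL_Customer_Data] → Snowflake.analytics.dim_customer"
)
print(edge)
```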

Column-level lineage

Column-level lineage tracks transformations for individual columns:

  • Column mapping through tMap components
  • Aggregations through tAggregateRow components
  • Joins through tJoin components
  • Filters and transformations

Example:

MySQL.sales.customers.first_name → [tMap: concat] → Snowflake.analytics.dim_customer.full_name
MySQL.sales.customers.last_name → [tMap: concat] → Snowflake.analytics.dim_customer.full_name
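
The two mappings above feed a single target column. As a rough illustration of how such edges can be grouped, the following Python sketch collects the upstream columns for each target. The tuple layout and the `upstream_columns` helper are assumptions made for this example, not the connector's internal model.

```python
from collections import defaultdict

# (source column, transformation, target column), copied from the example above.
column_edges = [
    ("MySQL.sales.customers.first_name", "tMap: concat",
     "Snowflake.analytics.dim_customer.full_name"),
    ("MySQL.sales.customers.last_name", "tMap: concat",
     "Snowflake.analytics.dim_customer.full_name"),
]


def upstream_columns(edges):
    """Map each target column to the list of source columns that feed it."""
    upstream = defaultdict(list)
    for source, _transform, target in edges:
        upstream[target].append(source)
    return dict(upstream)


sources_by_target = upstream_columns(column_edges)
print(sources_by_target)
```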

Notation

  • 🔀 = Includes lineage support
  • 📊 = Includes column-level details

For details on how lineage is calculated and known limitations, see Lineage.

FlowProject

Atlan maps Talend projects to its FlowProject asset type.

| Atlan property | File name | Talend property |
|---|---|---|
| name | talend.project | label |
| qualifiedName | N/A | Calculated |
| flowId | talend.project | xmi:id |
| description | talend.project | description |
| assetUserDefinedType | N/A | Hard coded |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |
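
The file-backed properties in this mapping (label, xmi:id, description) are XML attributes. The Python sketch below shows one way to read them with the standard library. The sample XML is a hypothetical, heavily simplified stand-in for a real talend.project file, and `read_project_properties` is an illustrative helper; neither reflects how the connector is actually implemented.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used for xmi:* attributes in XMI documents.
XMI_NS = "http://www.omg.org/XMI"

# Hypothetical, simplified talend.project content for illustration only.
SAMPLE_PROJECT = """
<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
         xmlns:TalendProperties="http://www.talend.org/properties">
  <TalendProperties:Project xmi:id="_abc123" label="SALES_ETL"
                            description="Nightly sales loads"/>
</xmi:XMI>
"""


def read_project_properties(xml_text):
    root = ET.fromstring(xml_text)
    # Take the first element named "Project", regardless of its namespace.
    project = next(el for el in root.iter() if el.tag.endswith("}Project"))
    return {
        "name": project.get("label"),
        "flowId": project.get(f"{{{XMI_NS}}}id"),
        "description": project.get("description"),
    }


props = read_project_properties(SAMPLE_PROJECT)
print(props)
```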

FlowControlOperation

Atlan maps Talend jobs to its FlowControlOperation asset type.

| Atlan property | File name | Talend property |
|---|---|---|
| name | .properties | label |
| qualifiedName | N/A | Calculated |
| flowId | .properties | xmi:id |
| description | .properties | description |
| flowProjectName | talend.project | label |
| flowProjectQualifiedName | talend.project | Calculated |
| assetUserDefinedType | N/A | Hard coded |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |

FlowReusableUnit

Atlan maps Talend reusable units (shared components and routines) to its FlowReusableUnit asset type.

| Atlan property | File name | Talend property |
|---|---|---|
| name | N/A | Calculated |
| qualifiedName | N/A | Calculated |
| flowId | N/A | Calculated |
| description | N/A | Hard coded |
| flowProjectName | talend.project | label |
| flowProjectQualifiedName | talend.project | Calculated |
| flowDatasetCount | N/A | Calculated |
| flowControlOperationCount | N/A | Calculated |
| assetUserDefinedType | N/A | Hard coded |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |

FlowDataset

Atlan maps Talend job components to its FlowDataset asset type. Components represent individual operations within a job, including transformations, database connections, and file operations.

| Atlan property | File name | Talend property |
|---|---|---|
| name | .item | Node.componentName |
| qualifiedName | N/A | Calculated |
| flowId | N/A | xmi:id |
| flowProjectName | talend.project | label |
| flowProjectQualifiedName | talend.project | Calculated |
| flowReusableUnitName | N/A | Calculated |
| flowReusableUnitQualifiedName | N/A | Calculated |
| assetUserDefinedType | N/A | Hard coded |
| flowType | .item | Node.componentName |
| flowQuery | N/A | Calculated |
| flowExpression | N/A | Calculated |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |

Component types

The connector crawls Talend job components across multiple categories:

  • Transformation components: Data mapping, filtering, joins, aggregations, sorting, and data normalization operations
  • Database components: Input/output operations for various database systems (MySQL, Oracle, SQL Server, PostgreSQL, and generic JDBC connections)
  • File components: Operations for delimited files, Excel, JSON, XML, and other file formats
  • Orchestration components: Job execution control, loops, and workflow management

The connector's component support is continuously expanding. Contact Atlan support for the most current list of supported component types.
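
As a rough illustration of these categories, the following Python sketch sorts a few well-known Talend component names into the four groups. The name sets and prefix rules are assumptions chosen for this example; they are not Atlan's classification logic and do not represent the list of supported components.

```python
# Illustrative name sets and prefixes (assumptions, not an official list).
TRANSFORMATION = {"tMap", "tJoin", "tAggregateRow", "tFilterRow", "tSortRow", "tNormalize"}
ORCHESTRATION = {"tRunJob", "tLoop", "tPrejob", "tPostjob"}
DATABASE_PREFIXES = ("tMysql", "tOracle", "tMSSql", "tPostgresql", "tJDBC", "tDB")


def categorize_component(component_name: str) -> str:
    """Guess a category from a Talend component name using simple naming heuristics."""
    if component_name in TRANSFORMATION:
        return "transformation"
    if component_name in ORCHESTRATION:
        return "orchestration"
    if component_name.startswith("tFile"):
        return "file"
    if component_name.startswith(DATABASE_PREFIXES):
        return "database"
    return "unknown"


for name in ("tMap", "tMysqlInput", "tFileInputDelimited", "tRunJob"):
    print(name, "->", categorize_component(name))
```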

FlowField

Atlan maps Talend component fields (columns and variables) to its FlowField asset type.

| Atlan property | File name | Talend property |
|---|---|---|
| name | .item | Calculated |
| qualifiedName | N/A | Calculated |
| flowId | N/A | xmi:id |
| flowProjectName | talend.project | label |
| flowProjectQualifiedName | talend.project | Calculated |
| flowReusableUnitName | N/A | Calculated |
| flowReusableUnitQualifiedName | N/A | Calculated |
| assetUserDefinedType | N/A | Hard coded |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |
| flowDataType | .item | Node.<metadata<column.type |
| flowDatasetName | .item | Node.componentName |
| flowDatasetQualifiedName | N/A | Calculated |
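
The flowDataType and flowDatasetName rows point at a column's type and its parent component inside a job's .item file. The Python sketch below walks a hypothetical, simplified fragment of such a file to show that nesting. The sample XML, the `extract_fields` helper, and the use of the column's name attribute for name (which the table lists as calculated) are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment of a job .item file (real files carry
# namespaces and many more attributes).
SAMPLE_ITEM = """
<ProcessType>
  <node componentName="tMysqlInput">
    <metadata connector="FLOW" name="tMysqlInput_1">
      <column name="customer_id" type="id_Integer"/>
      <column name="first_name" type="id_String"/>
    </metadata>
  </node>
</ProcessType>
"""


def extract_fields(xml_text):
    """Collect one record per column nested under node -> metadata -> column."""
    root = ET.fromstring(xml_text)
    fields = []
    for node in root.iter("node"):
        for column in node.findall("./metadata/column"):
            fields.append({
                "name": column.get("name"),  # assumption: the real name is calculated
                "flowDataType": column.get("type"),
                "flowDatasetName": node.get("componentName"),
            })
    return fields


fields = extract_fields(SAMPLE_ITEM)
print(fields)
```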

FlowDatasetOperation

Atlan maps Talend dataset operations to its FlowDatasetOperation asset type.

| Atlan property | File name | Talend property |
|---|---|---|
| name | N/A | label |
| qualifiedName | N/A | Calculated |
| flowId | N/A | xmi:id |
| description | N/A | Hard coded |
| flowProjectName | talend.project | label |
| flowProjectQualifiedName | talend.project | Calculated |
| inputs | N/A | Calculated |
| outputs | N/A | Calculated |
| assetUserDefinedType | N/A | Hard coded |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |

Process

Atlan maps Talend processes to its Process asset type for table-level lineage tracking.

| Atlan property | File name | Talend property |
|---|---|---|
| name | .item | Calculated |
| qualifiedName | N/A | Calculated |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |
| flowOrchestratedBy.qualifiedName | N/A | Calculated |
| inputs | N/A | Calculated |
| outputs | N/A | Calculated |

ColumnProcess

Atlan maps Talend column processes to its ColumnProcess asset type for column-level lineage tracking.

| Atlan property | File name | Talend property |
|---|---|---|
| name | .item | Calculated |
| qualifiedName | N/A | Calculated |
| connectorName | N/A | Hard coded |
| connectionName | N/A | UI driven |
| connectionQualifiedName | N/A | Calculated |
| lastSyncRunAt | N/A | Calculated |
| lastSyncWorkflowName | N/A | Workflow ID |
| lastSyncRun | N/A | Run ID |
| tenantId | N/A | Hard coded |
| process.qualifiedName | N/A | Calculated |
| inputs | N/A | Calculated |
| outputs | N/A | Calculated |

See also

  • Crawl Talend assets: Configure and run the workflow to discover and catalog Talend assets
  • Set up Talend: Configure GitHub or Stash access tokens for the Talend connector