
What Atlan crawls from Generic OpenLineage

Atlan extracts lineage and operational metadata from Generic OpenLineage events. This page lists the assets, filters, and field mappings that Atlan supports.

Filters

After integrating Generic OpenLineage, use connector-specific filters in Atlan to quickly find assets. The following filters are supported:

  • Status filter: last run status for an asset
  • Duration filter: last run duration for an asset

Lineage

Atlan creates two asset types from the inputs and outputs arrays in OpenLineage events:

  • Process: represents table-level lineage between input and output datasets
  • ColumnProcess: represents column-level lineage between individual fields

For the full list of supported data sources, see What data sources does Generic OpenLineage support for lineage?
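To make the two lineage levels concrete, the following Python sketch reads a trimmed-down OpenLineage event and lists the dataset pairs and field pairs it describes. It only illustrates where this information sits in an event (the inputs and outputs arrays, plus the columnLineage dataset facet defined by the OpenLineage spec); it is not Atlan's implementation. The dataset names, namespaces, and helper functions are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical, trimmed-down OpenLineage event; all names are made up.
event: Dict[str, Any] = {
    "eventType": "COMPLETE",
    "job": {"namespace": "example-namespace", "name": "daily_orders"},
    "inputs": [{"namespace": "postgres://db", "name": "public.orders"}],
    "outputs": [
        {
            "namespace": "snowflake://acct",
            "name": "analytics.orders_daily",
            "facets": {
                "columnLineage": {
                    "fields": {
                        "order_total": {
                            "inputFields": [
                                {
                                    "namespace": "postgres://db",
                                    "name": "public.orders",
                                    "field": "amount",
                                }
                            ]
                        }
                    }
                }
            },
        }
    ],
}


def dataset_id(ds: Dict[str, Any]) -> str:
    """Join namespace and name into one illustrative identifier."""
    return f"{ds['namespace']}/{ds['name']}"


def table_lineage(ev: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (input, output) dataset pairs: the table-level view."""
    return [
        (dataset_id(i), dataset_id(o))
        for o in ev.get("outputs", [])
        for i in ev.get("inputs", [])
    ]


def column_lineage(ev: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (input field, output field) pairs from the columnLineage facet."""
    edges = []
    for out in ev.get("outputs", []):
        fields = out.get("facets", {}).get("columnLineage", {}).get("fields", {})
        for out_field, spec in fields.items():
            for src in spec.get("inputFields", []):
                edges.append(
                    (f"{dataset_id(src)}.{src['field']}", f"{dataset_id(out)}.{out_field}")
                )
    return edges
```

In this sketch, the table-level pairs correspond to what a Process asset represents, and the field-level pairs to what a ColumnProcess asset represents.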


Assets

Atlan maps all OpenLineage jobs—whether parent-level or child-level—to the FlowControlOperation asset type. Each asset includes metadata fields extracted from the corresponding OpenLineage event properties.

Flow control operations

Both parent-level and child-level jobs are mapped to FlowControlOperation, with column-level lineage supported at both levels. The field mappings below cover parent-level jobs.

Parent-level jobs (APPLICATION, DAG, WORKFLOW) represent top-level orchestration units such as Airflow DAGs, Spark applications, or Alteryx workflows.

| Source property | Atlan property | Description |
| --- | --- | --- |
| job.name | name | Name of the parent job |
| job.namespace + job.name | qualifiedName | Unique identifier for the job in Atlan (connectionQF/jobName) |
| job.facets.jobType.integration | source | Integration source (for example, airflow, spark, alteryx) |
| job.facets.jobType.jobType | assetUserDefinedType | Type of the job (for example, Application, Dag, Workflow) |
| run.facets.parent.job.name | flowControlledBy | Reference to the controlling FlowControlOperation from another pipeline (for example, Airflow task controlling a Spark app) |
| run.facets.airflow.task.owner | sourceOwners | Owner information extracted from the Airflow task facet (Airflow-specific) |
| run.facets.airflow.task.owner | ownerUsers | Validated owner usernames extracted from the Airflow task facet (Airflow-specific) |
| run.facets.airflow.task.description | description | Description of the job extracted from the Airflow task facet (Airflow-specific) |
| run.facets.airflow.task.tags | tags | Tags associated with the job extracted from the Airflow task facet (Airflow-specific) |
| job.namespace | connectionName | Name of the connection, matched from job namespace |
| job.namespace | connectionQualifiedName | Unique identifier for the connector instance, matched from job namespace |
| URL path | connectorName | Name of the connector (generic-openlineage) |
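As a rough illustration of how a few of these rows read against an actual event, the sketch below pulls the listed source properties out of an event dictionary and assembles the corresponding Atlan property names. The helper names (dig, map_parent_job), the sample event, and the connection qualified name are hypothetical, and the code is not Atlan's implementation. Rows that depend on Atlan-side lookups, such as ownerUsers validation and connection matching, are left out.

```python
from typing import Any, Dict, Optional


def dig(obj: Any, path: str) -> Optional[Any]:
    """Follow a dotted path through nested dicts; return None if a key is missing."""
    for key in path.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def map_parent_job(event: Dict[str, Any], connection_qualified_name: str) -> Dict[str, Any]:
    """Apply a subset of the parent-level field mappings to one event."""
    job_name = dig(event, "job.name")
    return {
        "name": job_name,
        # Follows the connectionQF/jobName pattern from the table.
        "qualifiedName": f"{connection_qualified_name}/{job_name}",
        "assetUserDefinedType": dig(event, "job.facets.jobType.jobType"),
        "flowControlledBy": dig(event, "run.facets.parent.job.name"),
        "sourceOwners": dig(event, "run.facets.airflow.task.owner"),
        "description": dig(event, "run.facets.airflow.task.description"),
        "tags": dig(event, "run.facets.airflow.task.tags"),
        # The table derives this from the URL path; hardcoded here for brevity.
        "connectorName": "generic-openlineage",
    }


# Hypothetical event and connection qualified name, for illustration only.
sample_event = {
    "job": {
        "namespace": "example-namespace",
        "name": "etl_dag",
        "facets": {
            "jobType": {"processingType": "BATCH", "integration": "AIRFLOW", "jobType": "DAG"}
        },
    },
    "run": {"facets": {"airflow": {"task": {"owner": "data-eng"}}}},
}
props = map_parent_job(sample_event, "default/generic-openlineage/1700000000")
```

Properties whose source facet is absent from the event, such as flowControlledBy when there is no parent run facet, come back as None in this sketch.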

OpenLineage run metadata

Atlan captures run-level metadata for every OpenLineage event and maps it to the corresponding flow control operation. This metadata is available for both parent and child jobs and reflects the actual execution state of each run.

| Source | Atlan property | Description |
| --- | --- | --- |
| run.runId | flowRunId | Unique run identifier for the task |
| START event timestamp | flowStartedAt | Job/task start time |
| COMPLETE/ABORT/FAIL event timestamp | flowFinishedAt | Job/task end time |
| Final event type | flowStatus | Status of the job (COMPLETE, FAIL, ABORT) |
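The run-level mapping can be pictured as folding the events of one run into a single record. The sketch below does this for a list of events that share a runId, using the eventType, eventTime, and run.runId fields from the OpenLineage spec. The function name, the sample run ID, and the timestamps are hypothetical, and the ordering logic is an assumption rather than a description of how Atlan processes events.

```python
from typing import Any, Dict, Iterable, Optional

# Event types that end a run, per the mapping table above.
TERMINAL_TYPES = {"COMPLETE", "FAIL", "ABORT"}


def summarize_run(events: Iterable[Dict[str, Any]]) -> Dict[str, Optional[str]]:
    """Fold the events of one run into the run-level properties."""
    summary: Dict[str, Optional[str]] = {
        "flowRunId": None,
        "flowStartedAt": None,
        "flowFinishedAt": None,
        "flowStatus": None,
    }
    # Assumes all eventTime values use the same ISO 8601 UTC format,
    # so that sorting the strings also sorts them chronologically.
    for ev in sorted(events, key=lambda e: e["eventTime"]):
        summary["flowRunId"] = ev["run"]["runId"]
        if ev["eventType"] == "START":
            summary["flowStartedAt"] = ev["eventTime"]
        elif ev["eventType"] in TERMINAL_TYPES:
            summary["flowFinishedAt"] = ev["eventTime"]
            summary["flowStatus"] = ev["eventType"]
    return summary


# Hypothetical events for a single run, deliberately out of order.
run_id = "0f1e2d3c-0000-4000-8000-000000000001"
run_events = [
    {"eventType": "COMPLETE", "eventTime": "2024-01-01T00:10:00Z", "run": {"runId": run_id}},
    {"eventType": "START", "eventTime": "2024-01-01T00:00:00Z", "run": {"runId": run_id}},
    {"eventType": "RUNNING", "eventTime": "2024-01-01T00:05:00Z", "run": {"runId": run_id}},
]
run_summary = summarize_run(run_events)
```

A run that has emitted only a START event keeps flowFinishedAt and flowStatus empty in this sketch until a terminal event arrives.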
