What Atlan crawls from Generic OpenLineage
Atlan extracts lineage and operational metadata from Generic OpenLineage events. This page lists the assets, filters, and field mappings that Atlan supports.
Filters
After integrating Generic OpenLineage, use connector-specific filters in Atlan to quickly find assets. The following filters are supported:
- Status filter: last run status for an asset
- Duration filter: last run duration for an asset
Lineage
Atlan creates two asset types from the inputs and outputs arrays in OpenLineage events:
- Process: represents table-level lineage between input and output datasets
- ColumnProcess: represents column-level lineage between individual fields
For the full list of supported data sources, see What data sources does Generic OpenLineage support for lineage?
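To make the two lineage levels concrete, the following is a minimal sketch of how table-level and column-level edges could be read out of a single OpenLineage event. It is not Atlan's implementation. The helper names (`dataset_key`, `table_edges`, `column_edges`) and the sample dataset names are hypothetical, and it assumes column-level lineage is carried in the standard OpenLineage `columnLineage` facet on each output dataset.

```python
def dataset_key(dataset):
    """Build a readable identifier from an OpenLineage dataset (hypothetical format)."""
    return f"{dataset['namespace']}/{dataset['name']}"


def table_edges(event):
    """Pair every input dataset with every output dataset (table-level lineage)."""
    return [
        (dataset_key(src), dataset_key(dst))
        for src in event.get("inputs", [])
        for dst in event.get("outputs", [])
    ]


def column_edges(event):
    """Read field-to-field edges from the columnLineage facet of each output."""
    edges = []
    for dst in event.get("outputs", []):
        # Assumption: column lineage lives in outputs[].facets.columnLineage.fields
        fields = dst.get("facets", {}).get("columnLineage", {}).get("fields", {})
        for dst_field, spec in fields.items():
            for src in spec.get("inputFields", []):
                edges.append(
                    (
                        f"{src['namespace']}/{src['name']}.{src['field']}",
                        f"{dataset_key(dst)}.{dst_field}",
                    )
                )
    return edges


# Illustrative event; all names are made up for this example.
sample_event = {
    "eventType": "COMPLETE",
    "inputs": [{"namespace": "postgres://db", "name": "public.orders"}],
    "outputs": [
        {
            "namespace": "postgres://db",
            "name": "public.order_totals",
            "facets": {
                "columnLineage": {
                    "fields": {
                        "total": {
                            "inputFields": [
                                {
                                    "namespace": "postgres://db",
                                    "name": "public.orders",
                                    "field": "amount",
                                }
                            ]
                        }
                    }
                }
            },
        }
    ],
}

print(table_edges(sample_event))
print(column_edges(sample_event))
```

In this sketch each table-level edge would correspond to a Process and each column-level edge to a ColumnProcess. How Atlan groups or deduplicates these edges is not covered here.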
Assets
Atlan maps all OpenLineage jobs, whether parent-level or child-level, to the FlowControlOperation asset type. Each asset includes metadata fields extracted from the corresponding OpenLineage event properties.
Flow control operations
Both parent-level and child-level jobs are mapped to FlowControlOperation, with column-level lineage supported at both levels. The tables below show the field mappings for each level, parent first and then child.
Parent-level jobs (APPLICATION, DAG, WORKFLOW) represent top-level orchestration units such as Airflow DAGs, Spark applications, or Alteryx workflows.
| Source property | Atlan property | Description |
|---|---|---|
| job.name | name | Name of the parent job |
| job.namespace + job.name | qualifiedName | Unique identifier for the job in Atlan (connectionQF/jobName) |
| job.facets.jobType.integration | source | Integration source (for example, airflow, spark, alteryx) |
| job.facets.jobType.jobType | assetUserDefinedType | Type of the job (for example, Application, Dag, Workflow) |
| run.facets.parent.job.name | flowControlledBy | Reference to the controlling FlowControlOperation from another pipeline (for example, Airflow task controlling a Spark app) |
| run.facets.airflow.task.owner | sourceOwners | Owner information extracted from the Airflow task facet (Airflow-specific) |
| run.facets.airflow.task.owner | ownerUsers | Validated owner usernames extracted from the Airflow task facet (Airflow-specific) |
| run.facets.airflow.task.description | description | Description of the job extracted from the Airflow task facet (Airflow-specific) |
| run.facets.airflow.task.tags | tags | Tags associated with the job extracted from the Airflow task facet (Airflow-specific) |
| job.namespace | connectionName | Name of the connection, matched from job namespace |
| job.namespace | connectionQualifiedName | Unique identifier for the connector instance, matched from job namespace |
| URL path | connectorName | Name of the connector (generic-openlineage) |
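The parent-level mapping can be read as a lookup from dotted event paths to Atlan property names. The sketch below illustrates that reading only and is not Atlan's ingestion code. `get_path`, `PARENT_FIELD_MAP`, `map_parent_job`, and the `conn-qn` placeholder for the connection qualified name are all hypothetical. It leaves out `ownerUsers`, `connectionName`, and `connectionQualifiedName` because the validation and namespace-matching steps are not described on this page.

```python
def get_path(obj, dotted, default=None):
    """Follow a dotted path such as 'job.facets.jobType.integration' through nested dicts."""
    current = obj
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Atlan property -> OpenLineage source path, taken from the parent-level table.
PARENT_FIELD_MAP = {
    "name": "job.name",
    "source": "job.facets.jobType.integration",
    "assetUserDefinedType": "job.facets.jobType.jobType",
    "flowControlledBy": "run.facets.parent.job.name",
    "sourceOwners": "run.facets.airflow.task.owner",
    "description": "run.facets.airflow.task.description",
    "tags": "run.facets.airflow.task.tags",
}


def map_parent_job(event, connection_qn):
    """Return the Atlan-style properties that can be filled from one event."""
    props = {
        atlan_name: get_path(event, source_path)
        for atlan_name, source_path in PARENT_FIELD_MAP.items()
    }
    # qualifiedName follows the connectionQF/jobName pattern from the table.
    props["qualifiedName"] = f"{connection_qn}/{event['job']['name']}"
    props["connectorName"] = "generic-openlineage"
    # Drop properties whose source facet is absent from this event.
    return {key: value for key, value in props.items() if value is not None}


# Illustrative event; names and values are made up for this example.
parent_event = {
    "eventType": "START",
    "job": {
        "namespace": "airflow-prod",
        "name": "daily_sales",
        "facets": {"jobType": {"integration": "airflow", "jobType": "Dag"}},
    },
    "run": {
        "runId": "run-1",
        "facets": {"airflow": {"task": {"owner": "data-eng"}}},
    },
}

print(map_parent_job(parent_event, "conn-qn"))
```

Properties backed by Airflow-specific facets simply do not appear when the event comes from another integration, which matches the "Airflow-specific" notes in the table.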
Child-level jobs (JOB, TASK) represent individual tasks within a parent job, such as Airflow tasks or individual Spark job stages.
| Source property | Atlan property | Description |
|---|---|---|
| job.name (portion after parent name) | name | Name of the child task |
| job.namespace + job.name | qualifiedName | Unique identifier for the task in Atlan (connectionQF/parentName/childName) |
| job.facets.jobType.integration | source | Integration source (for example, airflow, spark, alteryx) |
| job.facets.jobType.jobType | assetUserDefinedType | Type of the job (for example, Job, Task) |
| run.facets.parent.run.runId | flowControlledBy | Reference to the parent FlowControlOperation |
| run.facets.airflow.task.upstream_task_ids | flowPredecessors | References to predecessor FlowControlOperations within the same parent (Airflow-specific) |
| run.facets.airflow.task.owner | sourceOwners | Owner information extracted from the Airflow task facet (Airflow-specific) |
| job.namespace | connectionName | Name of the connection, matched from job namespace |
| job.namespace | connectionQualifiedName | Unique identifier for the connector instance, matched from job namespace |
| URL path | connectorName | Name of the connector (generic-openlineage) |
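For child-level jobs, the `name` and `qualifiedName` rows imply splitting `job.name` into a parent part and a child part. The sketch below shows one way that split could work. It assumes the child's `job.name` is the parent name followed by a `.` separator (as in Airflow's `dag_id.task_id` naming) and that the parent name is known, for example from `run.facets.parent.job.name`. The function names and the `conn-qn` placeholder are hypothetical, and the separator Atlan actually uses is not stated on this page.

```python
def split_child_name(job_name, parent_name, separator="."):
    """Return the portion of job_name after the parent name.

    Assumption: child job names look like '<parent><separator><child>'.
    If the prefix does not match, the full job name is returned unchanged.
    """
    prefix = parent_name + separator
    if job_name.startswith(prefix):
        return job_name[len(prefix):]
    return job_name


def child_qualified_name(connection_qn, parent_name, job_name):
    """Build connectionQF/parentName/childName as described in the child table."""
    child = split_child_name(job_name, parent_name)
    return f"{connection_qn}/{parent_name}/{child}"


def predecessor_qualified_names(connection_qn, parent_name, upstream_task_ids):
    """Resolve Airflow upstream task IDs to sibling qualified names.

    Assumption: predecessors live under the same parent, per the
    flowPredecessors row; the exact reference format is hypothetical.
    """
    return [f"{connection_qn}/{parent_name}/{task_id}" for task_id in upstream_task_ids]


print(split_child_name("daily_sales.load_orders", "daily_sales"))
print(child_qualified_name("conn-qn", "daily_sales", "daily_sales.load_orders"))
print(predecessor_qualified_names("conn-qn", "daily_sales", ["extract_orders"]))
```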
OpenLineage run metadata
Atlan captures run-level metadata for every OpenLineage event and maps it to the corresponding flow control operation. This metadata is available for both parent and child jobs and reflects the actual execution state of each run.
| Source | Atlan property | Description |
|---|---|---|
| run.runId | flowRunId | Unique run identifier for the task |
| START event timestamp | flowStartedAt | Job/task start time |
| COMPLETE/ABORT/FAIL event timestamp | flowFinishedAt | Job/task end time |
| Final event type | flowStatus | Status of the job (COMPLETE, FAIL, ABORT) |
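The run metadata table can be pictured as folding the events of one run into a single record. The following is a minimal sketch of that fold, not Atlan's implementation. It assumes all events passed in share the same `run.runId` and carry ISO 8601 `eventTime` values; `summarize_run`, `duration_seconds`, and the sample events are hypothetical. The duration helper illustrates how a last-run duration could be derived from the two timestamps.

```python
from datetime import datetime

# Event types that end a run, per the flowFinishedAt and flowStatus rows.
TERMINAL_EVENT_TYPES = {"COMPLETE", "FAIL", "ABORT"}


def parse_time(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(events):
    """Fold the events of one run into flowRunId, start, finish, and status."""
    summary = {}
    # Process events in time order so a later terminal event wins.
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        summary["flowRunId"] = event["run"]["runId"]
        if event["eventType"] == "START":
            summary["flowStartedAt"] = event["eventTime"]
        elif event["eventType"] in TERMINAL_EVENT_TYPES:
            summary["flowFinishedAt"] = event["eventTime"]
            summary["flowStatus"] = event["eventType"]
    return summary


def duration_seconds(summary):
    """Return the run duration in seconds, or None if the run has not finished."""
    if "flowStartedAt" not in summary or "flowFinishedAt" not in summary:
        return None
    delta = parse_time(summary["flowFinishedAt"]) - parse_time(summary["flowStartedAt"])
    return delta.total_seconds()


# Illustrative events, deliberately listed out of order.
run_events = [
    {"eventType": "COMPLETE", "eventTime": "2024-01-01T00:05:00Z", "run": {"runId": "run-1"}},
    {"eventType": "START", "eventTime": "2024-01-01T00:00:00Z", "run": {"runId": "run-1"}},
]

run_summary = summarize_run(run_events)
print(run_summary)
print(duration_seconds(run_summary))
```

A run that has only emitted a START event yields a summary without `flowFinishedAt` or `flowStatus`, so the duration helper returns `None` for it.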
See also
- Integrate Generic OpenLineage: Set up the connector and configure event ingestion.
- How does hierarchical vs. non-hierarchical mode affect lineage creation?
- How does cross-system lineage work?