What does Atlan crawl from Apache Airflow/OpenLineage?
Once you have integrated Apache Airflow/OpenLineage, you can use connector-specific filters for quick asset discovery. The following filters are currently supported:
- Status filter - last run status for an asset
- Duration filter - last run duration for an asset
Atlan maps the following assets and properties from Apache Airflow/OpenLineage. Asset lineage support depends on the list of operators supported by OpenLineage.
DAGs
Atlan maps DAGs (directed acyclic graphs) from Apache Airflow/OpenLineage to its AirflowDAG asset type.
Source property | Atlan property | Description |
---|---|---|
job.name | name | Name of the Airflow DAG |
- | qualifiedName | Unique identifier for the DAG in Atlan |
description | description | Description of the DAG from Airflow |
owners | sourceOwners | Original owner information from Airflow |
- | ownerUsers | Validated Atlan usernames (mapped from source owners) |
schedule_interval | airflowDagSchedule | DAG's schedule interval (cron expression or preset) |
delta | airflowDagScheduleDelta | Schedule interval in seconds |
tags | airflowTags | Tags assigned to the DAG |
run_id | airflowRunName | Unique identifier for the DAG run |
run_type | airflowRunType | Type of run (scheduled, manual, backfill) |
eventTime (start) | airflowRunStartTime | Timestamp when the DAG run started |
eventTime (end) | airflowRunEndTime | Timestamp when the DAG run completed |
eventType | airflowRunOpenLineageState | Final status of the DAG run |
version | airflowRunVersion | Airflow version |
openlineageAdapterVersion | airflowRunOpenLineageVersion | OpenLineage adapter version |
- | sourceURL | Direct link to the DAG in Airflow UI |
- | connectionName | Name of the connector instance |
- | connectionQualifiedName | Unique identifier for the connector instance |
- | connectorName | Name of the connector type |
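As a rough illustration of the run-level rows in the table above, the sketch below folds the OpenLineage events of a single run into the start time, end time, and final state. It assumes each event is a dictionary carrying the standard OpenLineage `eventType`, `eventTime`, and `job.name` fields. The `summarize_run` and `parse_time` helpers and the sample events are hypothetical, and this is not a description of how Atlan processes events internally.

```python
from datetime import datetime

# OpenLineage event types that end a run.
TERMINAL_EVENT_TYPES = {"COMPLETE", "FAIL", "ABORT"}


def parse_time(value):
    # Rewrite a trailing "Z" so older Python versions can parse the timestamp.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(events):
    """Fold the events of one run into run-level properties (hypothetical helper)."""
    summary = {
        "name": None,
        "airflowRunStartTime": None,
        "airflowRunEndTime": None,
        "airflowRunOpenLineageState": None,
    }
    # Process events in chronological order, regardless of arrival order.
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        summary["name"] = event["job"]["name"]
        if event["eventType"] == "START":
            summary["airflowRunStartTime"] = event["eventTime"]
        elif event["eventType"] in TERMINAL_EVENT_TYPES:
            summary["airflowRunEndTime"] = event["eventTime"]
            summary["airflowRunOpenLineageState"] = event["eventType"]
    return summary


# Sample events (made up for illustration), deliberately out of order.
events = [
    {"eventType": "COMPLETE", "eventTime": "2024-01-01T00:05:00Z", "job": {"name": "example_dag"}},
    {"eventType": "START", "eventTime": "2024-01-01T00:00:00Z", "job": {"name": "example_dag"}},
]
run = summarize_run(events)
```

Sorting by parsed timestamp rather than arrival order keeps the summary stable if events are delivered out of sequence.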
Did you know?
If a DAG has more than 10 valid owner email addresses (comma-separated), only the first 10 will be captured and published.
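The owner limit described above can be sketched as follows. This is a minimal illustration only: the `capped_owner_emails` helper and the email pattern are hypothetical, and the validation rules Atlan actually applies may differ.

```python
import re

MAX_OWNERS = 10
# Deliberately loose email pattern, assumed here for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def capped_owner_emails(owners, limit=MAX_OWNERS):
    """Return the first `limit` valid email addresses from a comma-separated string."""
    candidates = (part.strip() for part in owners.split(","))
    valid = [candidate for candidate in candidates if EMAIL_PATTERN.match(candidate)]
    return valid[:limit]


# Twelve valid addresses plus one non-email owner value.
owners = ", ".join(f"user{i}@example.com" for i in range(12)) + ", airflow"
kept = capped_owner_emails(owners)
```

In this sketch, invalid entries are filtered out before the limit is applied, so the cap counts only valid addresses, matching the wording of the note.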
Tasks
Atlan maps tasks from Apache Airflow/OpenLineage to its AirflowTask asset type.
Source property | Atlan property | Description |
---|---|---|
job.name (partial) | name | Name of the task (extracted from full job name) |
- | qualifiedName | Unique identifier for the task in Atlan |
- | airflowDagName | Name of the parent DAG |
- | airflowDagQualifiedName | Unique identifier for the parent DAG in Atlan |
operator_class | airflowTaskOperatorClass | Type of operator used for the task |
conn_id | airflowTaskConnectionId | Connection ID used by the task |
sql | airflowTaskSql | SQL query (for SQL-based operators) |
owner | sourceOwners | Owner information from the task definition |
eventTime (start) | airflowRunStartTime | Timestamp when the task started |
eventTime (end) | airflowRunEndTime | Timestamp when the task completed |
eventType | airflowRunOpenLineageState | Final status of the task run |
run_id | airflowRunName | Unique identifier for the task run |
run_type | airflowRunType | Type of run (from parent DAG) |
pool | airflowTaskPool | Worker pool assigned to the task |
pool_slots | airflowTaskPoolSlots | Number of pool slots used by the task |
priority_weight | airflowTaskPriorityWeight | Priority weight for execution order |
queue |