What does Atlan crawl from Apache Flink/OpenLineage?
Once you have integrated Apache Flink/OpenLineage, you can use connector-specific filters for quick asset discovery. The following filters are currently supported:
- Status filter - last run status for an asset
- Duration filter - last run duration for an asset
Atlan maps the following assets and properties from Apache Flink/OpenLineage. Asset lineage support depends on the data sources that OpenLineage supports.
Jobs
Atlan represents Flink jobs as generic OpenLineage job assets and uses the job.name and job.namespace from the OpenLineage event to identify each Flink application.
| Source field | Atlan field | Data type | Description |
|---|---|---|---|
| job.name | name | String | Name of the Flink job |
| job.namespace | connectionQualifiedName | String | Matches the connection name configured in Atlan |
| - | qualifiedName | String | Unique identifier for the job in Atlan |
| - | connectorName | String | Name of the connector instance |
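As a rough illustration of the identification described above, the following sketch reads `job.namespace` and `job.name` from an OpenLineage run event. The event payload, its values, and the `job_identity` helper are hypothetical and only mirror the general shape of an OpenLineage event; the sketch doesn't show how Atlan builds `qualifiedName` internally.

```python
import json

# Hypothetical, trimmed OpenLineage run event; every value is illustrative only.
SAMPLE_EVENT = """
{
  "eventType": "START",
  "eventTime": "2024-05-01T10:00:00Z",
  "run": {"runId": "0190a1b2-0000-7000-8000-000000000001"},
  "job": {"namespace": "flink-prod", "name": "orders_enrichment"}
}
"""


def job_identity(event: dict) -> tuple:
    """Return the (namespace, name) pair that identifies a job in an event."""
    job = event["job"]
    return job["namespace"], job["name"]


if __name__ == "__main__":
    namespace, name = job_identity(json.loads(SAMPLE_EVENT))
    # Per the table above, the namespace is expected to match the
    # connection name configured in Atlan.
    print(f"namespace={namespace} name={name}")
```

Two events that share the same namespace and name refer to the same Flink job, so a mismatch between `job.namespace` and the configured connection name is a likely first thing to check when a job doesn't appear as expected.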
OpenLineage metadata
Atlan reports OpenLineage operational metadata for Flink jobs.
| Source field | Atlan field | Data type | Description |
|---|---|---|---|
| run.runId | runId | String | Unique run identifier emitted by the OpenLineage Flink integration |
| START event timestamp | startTime | DateTime | Job start time |
| COMPLETE/ABORT/FAIL event timestamp | endTime | DateTime | Job end time |
| Final event type | runState | String | Status of the job (COMPLETE, FAIL, ABORT) |
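The mapping in the table above can be sketched as a small reduction over the events of a single run. This is a minimal sketch under stated assumptions, not Atlan's implementation: the `summarize_run` and `parse_time` helpers and the sample events are hypothetical, and the duration is shown only to illustrate what a last-run duration could be derived from.

```python
from datetime import datetime

# Terminal event types listed in the table above.
TERMINAL_TYPES = {"COMPLETE", "FAIL", "ABORT"}


def parse_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(events: list) -> dict:
    """Derive runId, startTime, endTime, and runState from one run's events."""
    summary = {"runId": None, "startTime": None, "endTime": None, "runState": None}
    # Process events in chronological order so the last terminal event wins.
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        summary["runId"] = event["run"]["runId"]
        event_type = event["eventType"]
        if event_type == "START":
            summary["startTime"] = parse_time(event["eventTime"])
        elif event_type in TERMINAL_TYPES:
            summary["endTime"] = parse_time(event["eventTime"])
            summary["runState"] = event_type
    return summary


# Hypothetical events for a single run; all values are illustrative only.
EVENTS = [
    {"eventType": "START", "eventTime": "2024-05-01T10:00:00Z",
     "run": {"runId": "run-1"}},
    {"eventType": "COMPLETE", "eventTime": "2024-05-01T10:05:30Z",
     "run": {"runId": "run-1"}},
]

if __name__ == "__main__":
    run = summarize_run(EVENTS)
    duration = run["endTime"] - run["startTime"]
    print(run["runState"], duration.total_seconds())
```

A long-running streaming job may not have emitted a terminal event yet, in which case `endTime` and `runState` stay unset in this sketch.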
When a source or target dataset emitted by the OpenLineage Flink integration has no matching cataloged asset, Atlan represents it as a partial asset in the lineage view. To enrich these assets, set up the corresponding source connectors (for example, Apache Kafka, Apache Iceberg, or JDBC-based sources) alongside Apache Flink/OpenLineage.
See also
- Integrate Apache Flink/OpenLineage: Step-by-step guide to set up the Apache Flink/OpenLineage connector in Atlan.