
What does Atlan crawl from Apache Flink/OpenLineage?

Once you have integrated Apache Flink/OpenLineage, you can use connector-specific filters for quick asset discovery. The following filters are currently supported:

  • Status filter - last run status for an asset
  • Duration filter - last run duration for an asset

Atlan maps the following assets and properties from Apache Flink/OpenLineage. Asset lineage support depends on the data sources that OpenLineage supports.

Jobs

Atlan represents Flink jobs as generic OpenLineage job assets and uses the job.name and job.namespace from the OpenLineage event to identify each Flink application.

Source field  | Atlan field             | Data type | Description
job.name      | name                    | String    | Name of the Flink job
job.namespace | connectionQualifiedName | String    | Matches the connection name configured in Atlan
-             | qualifiedName           | String    | Unique identifier for the job in Atlan
-             | connectorName           | String    | Name of the connector instance
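
To illustrate the two source fields in the table above, here is a minimal Python sketch that reads job.name and job.namespace from an OpenLineage run event. The event values (namespace, job name, run ID) and the helper name job_identity are hypothetical and for illustration only. The sketch shows where the fields sit in the event payload, not how Atlan processes them internally.

```python
import json

# Hypothetical OpenLineage run event; all values are illustrative only.
EVENT_JSON = """
{
  "eventType": "START",
  "eventTime": "2024-01-01T10:00:00Z",
  "run": {"runId": "0190a1b2-0000-7000-8000-000000000001"},
  "job": {"namespace": "flink-production", "name": "orders_enrichment"},
  "inputs": [],
  "outputs": []
}
"""


def job_identity(event: dict) -> tuple[str, str]:
    """Return (job.name, job.namespace) from an OpenLineage run event.

    Per the table above, job.name becomes the asset name and
    job.namespace is matched against the connection name in Atlan.
    """
    job = event["job"]
    return job["name"], job["namespace"]


name, namespace = job_identity(json.loads(EVENT_JSON))
print(f"name={name} namespace={namespace}")
```

Because job.namespace is matched against the connection name, the namespace configured in the Flink job's OpenLineage settings should be the same string as the connection name in Atlan.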

OpenLineage metadata

Atlan reports the following OpenLineage operational metadata for each Flink job run.

Source field                        | Atlan field | Data type | Description
run.runId                           | runId       | String    | Unique run identifier emitted by the OpenLineage Flink integration
START event timestamp               | startTime   | DateTime  | Job start time
COMPLETE/ABORT/FAIL event timestamp | endTime     | DateTime  | Job end time
Final event type                    | runState    | String    | Status of the job (COMPLETE, FAIL, ABORT)
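
As a rough illustration of the table above, the following Python sketch folds the events of one run into the four fields listed. The helper names (parse_time, summarize_run) and the sample events are hypothetical. Computing a duration as endTime minus startTime is an assumption made for this example, not a statement of how Atlan calculates the value behind its duration filter.

```python
from datetime import datetime

# Event types that end a run, as listed in the table above.
TERMINAL_TYPES = {"COMPLETE", "FAIL", "ABORT"}


def parse_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(events: list[dict]) -> dict:
    """Derive runId, startTime, endTime and runState from one run's events."""
    summary = {"runId": None, "startTime": None, "endTime": None, "runState": None}
    # Process events in chronological order so the last terminal event wins.
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        summary["runId"] = event["run"]["runId"]
        timestamp = parse_time(event["eventTime"])
        if event["eventType"] == "START":
            summary["startTime"] = timestamp
        elif event["eventType"] in TERMINAL_TYPES:
            summary["endTime"] = timestamp
            summary["runState"] = event["eventType"]
    return summary


# Hypothetical events for a single run; values are illustrative only.
RUN = {"runId": "0190a1b2-0000-7000-8000-000000000001"}
EVENTS = [
    {"eventType": "START", "eventTime": "2024-01-01T10:00:00Z", "run": RUN},
    {"eventType": "RUNNING", "eventTime": "2024-01-01T10:01:00Z", "run": RUN},
    {"eventType": "COMPLETE", "eventTime": "2024-01-01T10:05:00Z", "run": RUN},
]

summary = summarize_run(EVENTS)
# Assumed duration: difference between the terminal and START timestamps.
duration_seconds = (summary["endTime"] - summary["startTime"]).total_seconds()
print(summary["runState"], duration_seconds)
```

In this sketch, a run with no terminal event yet, such as a long-running streaming job, keeps endTime and runState unset.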

Source and target datasets emitted by the OpenLineage Flink integration are represented as partial assets in Atlan's lineage view when a matching cataloged asset doesn't exist. To enrich these assets, set up the corresponding source connectors (for example, Apache Kafka, Apache Iceberg, or JDBC-based sources) alongside Apache Flink/OpenLineage.

See also