What does Atlan crawl from Apache Spark/OpenLineage?

Once you have integrated Apache Spark/OpenLineage, you can use connector-specific filters for quick asset discovery. The following filters are currently supported:

  • Status filter - last run status for an asset
  • Duration filter - last run duration for an asset

Atlan maps the following assets and properties from Apache Spark/OpenLineage. Asset lineage support depends on the data sources that OpenLineage supports.

Jobs

Atlan maps jobs from Apache Spark to its SparkJob asset type. Atlan also supports column-level lineage for Spark jobs.

Source property    Atlan property
appName            sparkAppName
master             sparkMaster
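
To make the mapping above concrete, here is a minimal Python sketch that renames the two source properties to their Atlan counterparts. The function name `to_atlan_properties` and the sample job settings are hypothetical and used only for illustration. This is not Atlan's implementation, only a restatement of the table as code.

```python
# Mapping taken from the table above: Spark source property -> Atlan property.
SOURCE_TO_ATLAN = {
    "appName": "sparkAppName",
    "master": "sparkMaster",
}


def to_atlan_properties(spark_settings: dict) -> dict:
    """Rename the source properties listed in the table; skip everything else."""
    return {
        SOURCE_TO_ATLAN[key]: value
        for key, value in spark_settings.items()
        if key in SOURCE_TO_ATLAN
    }


# Hypothetical job settings, for illustration only.
example = to_atlan_properties(
    {"appName": "daily_orders_etl", "master": "yarn", "deployMode": "cluster"}
)
print(example)
```

Settings that do not appear in the table, such as `deployMode` in this sample, are dropped by the sketch because the table lists no Atlan property for them.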

OpenLineage metadata

Atlan reports OpenLineage operational metadata for Spark jobs.

Atlan property                Description
sparkRunVersion               Spark runtime version
sparkRunOpenLineageVersion    OpenLineage library version
sparkRunStartTime             Job start time
sparkRunEndTime               Job end time
sparkRunOpenLineageState      Status of the job
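
As a rough illustration of where the time and state values could come from, the Python sketch below folds a list of OpenLineage run events into the three run properties. It relies on the `eventType` and `eventTime` fields of the OpenLineage run event format. The function name `summarize_run`, the sample events, and the rule used for each property are assumptions made for this example. How Atlan actually derives these values is not described here.

```python
from datetime import datetime

# Event types that end a run in the OpenLineage run event model.
TERMINAL_EVENT_TYPES = {"COMPLETE", "FAIL", "ABORT"}


def summarize_run(events: list) -> dict:
    """Reduce the events of one run to start time, end time, and last state.

    Assumed rules (illustrative only):
      - start time = eventTime of the earliest START event
      - end time   = eventTime of the earliest terminal event
      - state      = eventType of the most recent event
    """
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["eventTime"]))
    start = next((e["eventTime"] for e in ordered if e["eventType"] == "START"), None)
    end = next(
        (e["eventTime"] for e in ordered if e["eventType"] in TERMINAL_EVENT_TYPES),
        None,
    )
    state = ordered[-1]["eventType"] if ordered else None
    return {
        "sparkRunStartTime": start,
        "sparkRunEndTime": end,
        "sparkRunOpenLineageState": state,
    }


# Hypothetical events for a single run, deliberately out of order.
sample_events = [
    {"eventType": "RUNNING", "eventTime": "2024-01-01T10:02:00+00:00"},
    {"eventType": "COMPLETE", "eventTime": "2024-01-01T10:05:00+00:00"},
    {"eventType": "START", "eventTime": "2024-01-01T10:00:00+00:00"},
]
summary = summarize_run(sample_events)
print(summary)
```

A run that has not finished yet yields `None` for `sparkRunEndTime` in this sketch, and its state is whatever event type was received last.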