Integrate Astronomer/OpenLineage

To integrate Astronomer/OpenLineage with Atlan, configure a connection in Atlan, set environment variables in Astronomer to route OpenLineage events, and verify connectivity by running a preflight check DAG. Once configured, Atlan processes lineage events from your Airflow DAGs and catalogs your pipeline assets automatically. To learn more about OpenLineage, refer to OpenLineage configuration and facets.

Before you begin

Before you start, make sure you have one of the following set up in Atlan:

When you configure Astronomer, you also need the name of the connection that you create in the next step.

Configure workflow in Atlan

To select Astronomer/OpenLineage as your source, from within Atlan:

  1. In the top right of any screen, click New and then click New workflow.
  2. From the filters along the top, click Orchestrator.
  3. From the list of packages, select Astronomer Airflow Assets and then click Setup Workflow.

Create connection

warning

A single connection (namespace) must be used for only one Airflow instance. Using the same connection across multiple instances may cause environment variables to update incorrectly, leading to unexpected behavior.

You only need to create a connection once to enable Atlan to receive incoming OpenLineage events. Once you have set up the connection, you don't need to rerun or schedule the workflow. Atlan processes the OpenLineage events as your DAGs run to catalog your Apache Airflow assets.

To configure the Astronomer/OpenLineage connection, from within Atlan:

  1. For Connection Name, provide a connection name that represents your source environment. For example, you might use values like production, development, gold, or analytics.
  2. To change the users who are able to manage this connection, change the users or groups listed under Connection Admins. If you don't specify any user or group, no one can manage the connection, not even admins.
  3. To let users view assets directly in Astronomer from the asset profile in Atlan, enter the URL of your Astronomer Airflow UI in the Host field.
  4. If your Astronomer Airflow UI uses a non-default port, enter the port number in the Port field.
  5. For Enable OpenLineage Events, click Yes to enable the processing of OpenLineage events or click No to disable it. If disabled, new events won't be processed in Atlan.
  6. To create a connection, at the bottom of the screen, click Create connection.

Configure integration in Astronomer

warning

Atlan doesn't support integrating with Apache Airflow versions older than 2.5.0.

Astronomer has a built-in OpenLineage integration. Atlan recommends using OpenLineage version 1.2.1 or later. For more information, see the OpenLineage Python client documentation. You need to use environment variables in Astronomer to set custom values for the integration with Atlan.
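
The version requirements above can be checked from inside a deployment image with a short script. The following is a minimal sketch, not an official Atlan or Astronomer tool. The distribution names `apache-airflow` and `openlineage-python` are assumptions about how the packages are installed in your image, and the helper functions are hypothetical names introduced for illustration.

```python
import itertools
from importlib import metadata

# Minimum versions stated on this page. The distribution names are
# assumptions; adjust them to match your deployment image.
MINIMUM_VERSIONS = {
    "apache-airflow": "2.5.0",
    "openlineage-python": "1.2.1",
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.7.3' into (2, 7, 3), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            break  # stop at suffixes such as "rc1" or "+astro.1"
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed() -> dict[str, str]:
    """Report 'ok', 'too old', or 'not installed' for each package."""
    report = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_installed().items():
        print(f"{name}: {status}")
```

The comparison is deliberately simple and handles only numeric release segments. If your image uses unusual version strings, a dedicated version-parsing library is likely a better fit.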

To configure Astronomer to send OpenLineage events to Atlan:

  1. Open your Astronomer console and select a workspace.
  2. In the left menu under Workspace, click Deployments and then select the required deployment.
  3. On your deployment page, click the Variables tab.
  4. On the Variables page, click the Edit variables button.
  5. Add the following environment variable keys and corresponding values:
    • For Apache Airflow versions 2.7.0 onward:

      • AIRFLOW__OPENLINEAGE__NAMESPACE: set to the connection name exactly as configured in Atlan.
      • OPENLINEAGE_DISABLED and AIRFLOW__OPENLINEAGE__DISABLED: set both to false to enable the OpenLineage listener in Apache Airflow, if disabled by default.
      • OPENLINEAGE__TRANSPORT__TYPE: set to composite. This sends OpenLineage events to multiple backends simultaneously.
      • OPENLINEAGE__TRANSPORT__CONTINUE_ON_FAILURE: set to true so events continue routing to other backends if one fails.
      • OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__TYPE: set to transform.
      • OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__TRANSFORMER_CLASS: set to openlineage.client.transport.transform.JobNamespaceReplaceTransformer.
      • OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__TRANSFORMER_PROPERTIES: set to {"new_job_namespace": "<connection-name>"}. Replace <connection-name> with the connection name exactly as configured in Atlan.
      • OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__TRANSPORT__TYPE: set to http.
      • OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__TRANSPORT__URL: set to https://<instance>.atlan.com/events/openlineage/airflow-astronomer/. Replace <instance> with the name of your Atlan instance.
      • OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__TRANSPORT__AUTH: set to {"type":"api_key", "api_key":"<API_token>"}. Replace <API_token> with the API token generated in Atlan.
      Understanding the transport architecture

      This configuration uses a composite transport that sends OpenLineage events to multiple backends at once. Within that composite, an Atlan-specific transform transport wraps an HTTP transport that sends events to Atlan. This layered structure lets Astronomer continue sending events to its own backend while also routing them to Atlan.

    • For Apache Airflow versions 2.5.0 up to, but not including, 2.7.0:

      • OPENLINEAGE_URL: set to the URL of the service that consumes OpenLineage events, for example https://<instance>.atlan.com/events/openlineage/airflow-astronomer/. Replace <instance> with the name of your Atlan instance.
      • OPENLINEAGE_API_KEY: set to the API token generated in Atlan.
      • OPENLINEAGE_NAMESPACE: set to the connection name exactly as configured in Atlan.
      • OPENLINEAGE_DISABLED and AIRFLOW__OPENLINEAGE__DISABLED: set both to false to enable the OpenLineage listener in Apache Airflow, if disabled by default.
  6. Click Update Environment Variables to save your changes. New variables can take up to two minutes to take effect in your deployment.
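
To make the two variable sets in step 5 easier to compare, the following minimal Python sketch assembles them as key-value pairs for a given Airflow version. It is an illustration only: `build_openlineage_env` is a hypothetical helper, and the instance name, connection name, and token passed to it are placeholders. The variable names and values themselves come from the lists above.

```python
import json


def build_openlineage_env(
    airflow_version: tuple[int, int],
    instance: str,
    connection_name: str,
    api_token: str,
) -> dict[str, str]:
    """Return the environment variables for the given Airflow version.

    Hypothetical helper for illustration; all arguments are placeholders
    that you replace with your own values.
    """
    if airflow_version < (2, 5):
        raise ValueError("Apache Airflow versions older than 2.5.0 are not supported")

    url = f"https://{instance}.atlan.com/events/openlineage/airflow-astronomer/"
    listener_flags = {
        "OPENLINEAGE_DISABLED": "false",
        "AIRFLOW__OPENLINEAGE__DISABLED": "false",
    }

    if airflow_version < (2, 7):
        # Airflow 2.5.0 up to, but not including, 2.7.0: direct HTTP settings.
        return {
            "OPENLINEAGE_URL": url,
            "OPENLINEAGE_API_KEY": api_token,
            "OPENLINEAGE_NAMESPACE": connection_name,
            **listener_flags,
        }

    # Airflow 2.7.0 onward: composite transport, with a transform transport
    # named ATLAN that wraps an HTTP transport.
    atlan = "OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__"
    return {
        "AIRFLOW__OPENLINEAGE__NAMESPACE": connection_name,
        **listener_flags,
        "OPENLINEAGE__TRANSPORT__TYPE": "composite",
        "OPENLINEAGE__TRANSPORT__CONTINUE_ON_FAILURE": "true",
        atlan + "TYPE": "transform",
        atlan + "TRANSFORMER_CLASS": (
            "openlineage.client.transport.transform.JobNamespaceReplaceTransformer"
        ),
        atlan + "TRANSFORMER_PROPERTIES": json.dumps(
            {"new_job_namespace": connection_name}
        ),
        atlan + "TRANSPORT__TYPE": "http",
        atlan + "TRANSPORT__URL": url,
        atlan + "TRANSPORT__AUTH": json.dumps(
            {"type": "api_key", "api_key": api_token}
        ),
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own instance, connection, and token.
    env = build_openlineage_env((2, 7), "<instance>", "<connection-name>", "<API_token>")
    for key, value in env.items():
        print(f"{key}={value}")
```

Generating the JSON values with `json.dumps` avoids quoting mistakes when the connection name or token contains special characters. The output is meant to be pasted into the Variables tab, not committed to source control, since it includes the API token.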

Verify connection

To verify connectivity between Astronomer and Atlan:

  1. For Verify connection with Astronomer, click the clipboard icon to copy the preflight check DAG, and then run it on your Astronomer instance to test connectivity with Atlan. If you encounter any errors after running the DAG, refer to the preflight checks documentation.
  2. Click Done to complete setup.
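
If the preflight check DAG reports errors, it can help to first rule out typos in the environment variables. The following minimal sketch checks the Airflow 2.7.0 onward variables for internal consistency. It is not Atlan's preflight check DAG and makes no network calls. `find_config_problems` is a hypothetical helper, and the rules it applies are assumptions derived from the variable descriptions on this page.

```python
import json
import os
from typing import Mapping, Optional

ATLAN = "OPENLINEAGE__TRANSPORT__TRANSPORTS__ATLAN__"
EVENTS_PATH = "/events/openlineage/airflow-astronomer/"


def _load_json_object(
    env: Mapping[str, str], key: str, problems: list[str]
) -> Optional[dict]:
    """Parse env[key] as a JSON object, recording a problem on failure."""
    raw = env.get(key)
    if raw is None:
        problems.append(f"{key} is not set")
        return None
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        problems.append(f"{key} is not valid JSON")
        return None
    if not isinstance(value, dict):
        problems.append(f"{key} must be a JSON object")
        return None
    return value


def find_config_problems(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: list[str] = []

    namespace = env.get("AIRFLOW__OPENLINEAGE__NAMESPACE")
    if not namespace:
        problems.append("AIRFLOW__OPENLINEAGE__NAMESPACE is not set")

    # The transformer's namespace must match the connection name exactly.
    props = _load_json_object(env, ATLAN + "TRANSFORMER_PROPERTIES", problems)
    if props is not None and props.get("new_job_namespace") != namespace:
        problems.append("new_job_namespace does not match AIRFLOW__OPENLINEAGE__NAMESPACE")

    auth = _load_json_object(env, ATLAN + "TRANSPORT__AUTH", problems)
    if auth is not None and (auth.get("type") != "api_key" or not auth.get("api_key")):
        problems.append("TRANSPORT__AUTH must have type api_key and a non-empty api_key")

    url = env.get(ATLAN + "TRANSPORT__URL", "")
    if not (url.startswith("https://") and url.endswith(EVENTS_PATH)):
        problems.append(f"TRANSPORT__URL should be an https URL ending in {EVENTS_PATH}")

    for key in ("OPENLINEAGE_DISABLED", "AIRFLOW__OPENLINEAGE__DISABLED"):
        if env.get(key, "false").strip().lower() == "true":
            problems.append(f"{key} is true, so the OpenLineage listener is disabled")

    return problems


if __name__ == "__main__":
    for problem in find_config_problems(os.environ):
        print(problem)
```

A clean result only shows that the variables agree with each other. The preflight check DAG remains the way to confirm that events actually reach Atlan.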

Once your DAGs have finished running in Apache Airflow, the DAGs and tasks appear in Atlan along with lineage from OpenLineage events.

You can also view event logs in Atlan to track and debug events received from OpenLineage.

Next steps