Crawl dbt

Configure and run the dbt crawler to extract metadata from your dbt Cloud or dbt Core projects and enrich your assets with dbt model information, lineage, and documentation.

Prerequisites

Before you begin, make sure you have:

Create crawler workflow

Follow these steps to create a workflow in Atlan that captures metadata from dbt.

  1. In Atlan, select New > New Workflow.
  2. From the package list, choose dbt Assets.
  3. Select Setup Workflow.

Configure authentication

Choose your dbt source and provide the required credentials.

  1. For Extraction method, click Cloud.
  2. For Host Name, enter the URL of your dbt Cloud instance if it differs from the default, including the https:// prefix. For example:
    https://cloud.getdbt.com
    For more information on access URLs, refer to dbt documentation.
  3. For Authentication Type, select Service Account or PAT (personal access token), depending on your token type.
  4. Enter your dbt Cloud token in the Token field.
  5. Click Test Authentication to verify the connection.
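
If Test Authentication fails, it can help to check the host name and token outside Atlan first. The following is a minimal sketch, not a description of what Atlan does internally. It assumes the dbt Cloud Administrative API v2 accounts endpoint (`/api/v2/accounts/`) and the `Authorization: Token ...` header format, so confirm both against the dbt documentation for your plan. The function names `normalize_host` and `build_auth_check` are hypothetical.

```python
from urllib.parse import urlparse
from urllib.request import Request

# Default dbt Cloud access URL, as shown in the step above.
DEFAULT_HOST = "https://cloud.getdbt.com"


def normalize_host(host: str = "") -> str:
    """Return the base URL, requiring an explicit https:// scheme."""
    host = host.strip() or DEFAULT_HOST
    parsed = urlparse(host)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"expected a URL starting with https://, got {host!r}")
    # Drop any path or trailing slash so endpoints can be appended safely.
    return f"https://{parsed.netloc}"


def build_auth_check(host: str, token: str) -> Request:
    """Build (but do not send) a request listing accounts visible to the token.

    Assumption: the endpoint path and header format follow the dbt Cloud
    Administrative API v2.
    """
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    return Request(
        normalize_host(host) + "/api/v2/accounts/",
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
```

Passing the returned request to `urllib.request.urlopen` sends it. Under the assumptions above, a successful response suggests the token is valid for that host, while an HTTP 401 points to a token problem.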

Configure connection

To complete the dbt connection configuration:

  1. Provide a Connection Name that represents your source environment. For example, you might use values like analytics, production, or development.

  2. (Optional) To change the users able to manage this connection, change the users or groups listed under Connection Admins.

    Warning: If you don't specify any user or group, nobody can manage the connection, not even admins.

  3. At the bottom of the screen, click Next to proceed.

Configure dbt settings

The configuration options change based on the Extraction method you selected earlier: Cloud or Core (object storage). Follow these steps to fine-tune how dbt metadata is enriched in Atlan.

  1. Under Exclude Metadata, choose projects or environments you don't want to include in enrichment. Leave blank if you want all available projects.
  2. Under Include Metadata, select specific projects or environments to include.
  3. To limit the enrichment to a particular connection with materialized assets, click Connection and select the relevant option. (If no connection is specified, enrichment applies to all connections.)
  4. For Import Tags, click Yes to sync dbt tags from your Cloud workspace into Atlan.
  5. For Enrich Metadata in Materialized Assets, click Yes to enable enrichment for both dbt and materialized assets.
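
To reason about how the Include Metadata and Exclude Metadata filters might interact, here is a small sketch. The semantics are an assumption, not confirmed Atlan behavior: an empty include list is treated as "all available projects", and an exclusion always wins over an inclusion. The function name `select_projects` and the project names in the usage example are hypothetical.

```python
from typing import Iterable, List


def select_projects(
    available: Iterable[str],
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> List[str]:
    """Return the projects that would be enriched under the assumed rules.

    Assumptions (not confirmed by the product documentation):
    - an empty include filter means every available project is a candidate;
    - anything listed in exclude is dropped, even if it is also included.
    """
    available = list(available)
    included = set(include) or set(available)
    excluded = set(exclude)
    # Preserve the original ordering of the available projects.
    return [p for p in available if p in included and p not in excluded]


# Usage with hypothetical project names:
projects = ["analytics", "finance", "sandbox"]
print(select_projects(projects))                       # no filters
print(select_projects(projects, exclude=["sandbox"]))  # exclude only
```

If your results in Atlan differ from this sketch, trust the product behavior and adjust the filters accordingly.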

Run crawler

After completing the previous steps, run the dbt crawler:

  1. To check for any permissions or other configuration issues before running the crawler, click Preflight checks.
  2. Then do one of the following:
    • To run the crawler once immediately, click the Run button at the bottom of the screen.
    • To schedule the crawler to run hourly, daily, weekly, or monthly, click the Schedule Run button at the bottom of the screen.
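
For readers who think of schedules as cron expressions, the four frequencies above can be sketched as standard five-field cron strings. This is illustrative only: whether Atlan stores schedules as cron expressions is an assumption here, and the chosen minute, hour, and day values (top of the hour, midnight, Sunday, first of the month) are arbitrary defaults. The names `CRON_BY_FREQUENCY` and `cron_for` are hypothetical.

```python
# Standard five-field cron: minute hour day-of-month month day-of-week.
CRON_BY_FREQUENCY = {
    "hourly": "0 * * * *",   # minute 0 of every hour
    "daily": "0 0 * * *",    # 00:00 every day
    "weekly": "0 0 * * 0",   # 00:00 every Sunday
    "monthly": "0 0 1 * *",  # 00:00 on the first day of each month
}


def cron_for(frequency: str) -> str:
    """Look up the cron expression for one of the four supported frequencies."""
    key = frequency.strip().lower()
    if key not in CRON_BY_FREQUENCY:
        raise ValueError(f"unsupported frequency: {frequency!r}")
    return CRON_BY_FREQUENCY[key]
```

Running the crawler more often than your dbt jobs produce new artifacts is unlikely to surface new metadata, so it is reasonable to pick the frequency closest to your dbt job cadence.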

Once the crawler has completed running, you can see the assets in Atlan's asset page! 🎉

See also