Crawl dbt
Once you have configured a dbt Cloud service token or uploaded your dbt Core project files to cloud storage, you can crawl dbt metadata into Atlan.
To enrich metadata in Atlan from dbt, review the order of operations and then complete the following steps.
Select the source
To select dbt as your source:
- In the top right of any screen, navigate to New and then click New Workflow.
- From the list of packages, select dbt Assets and then click Setup Workflow.
Provide your credentials
dbt Core
To enter your dbt Core credentials:
- For Extraction method, click Object Storage.
- Enter the details for the object storage location of your project files.
- Click the Test Authentication button to confirm connectivity to object storage using these details.
- Once authentication is successful, navigate to the bottom of the screen and click Next.
dbt Cloud
To enter your dbt Cloud credentials:
- For Extraction method, click Cloud.
- For Host Name, enter the domain name of your dbt Cloud instance, if not the default. Include the https:// prefix. For more information on access URLs, refer to dbt documentation.
- For Authentication Type, Service Account is the default selection, for use with a service account token. Change to PAT to enter a personal access token (PAT) instead.
- For Token, enter the dbt Cloud token you generated.
- Click the Test Authentication button to confirm connectivity to dbt Cloud using these details.
- Once authentication is successful, navigate to the bottom of the screen and click Next.
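Because the Host Name field expects the https:// prefix, it can help to normalize the value before pasting it in. The following is a minimal sketch, not part of Atlan or dbt: `normalize_host_name` is a hypothetical helper, `example.getdbt.com` is a placeholder domain, and stripping any trailing path or slash is an assumption about what the field accepts.

```python
from urllib.parse import urlparse


def normalize_host_name(raw: str) -> str:
    """Return the host with an explicit https:// scheme (hypothetical helper)."""
    value = raw.strip()
    if not value:
        raise ValueError("host name is empty")
    # Add the scheme when only a bare domain was supplied.
    if "://" not in value:
        value = "https://" + value
    parsed = urlparse(value)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"expected an https:// URL, got {raw!r}")
    # Assumption: only scheme and domain are wanted, so any path is dropped.
    return f"https://{parsed.netloc}"


if __name__ == "__main__":
    # example.getdbt.com is a placeholder, not a real instance.
    print(normalize_host_name("example.getdbt.com"))
```

The helper rejects http:// values rather than silently upgrading them, so a mistyped scheme is surfaced before you click Test Authentication.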
Configure the connection
To complete the dbt connection configuration:
- Provide a Connection name that represents your source environment. For example, you might use values like production, development, gold, or analytics.
- (Optional) To change the users who are able to manage this connection, change the users or groups listed under Connection Admins.
  Warning: If you don't specify any user or group, no one can manage the connection, not even admins.
- Navigate to the bottom of the screen and click Next to proceed.
Configure the crawler
Before running the dbt crawler, you can further configure it.
If a project appears in both the include and exclude filters, the exclude filter takes precedence.
dbt Core
On the Configuration page for dbt Core, you can override the defaults for any of these options:
- To limit the enrichment to a particular connection with materialized assets, click Connection and select the relevant option. (This defaults to all connections, if none are specified.)
- To import existing tags from dbt to Atlan, for Import Tags, click Yes.
dbt Cloud
On the Configuration page for dbt Cloud, you can override the defaults for any of these options:
- To select the dbt projects and environments you want to exclude from crawling, click Exclude Metadata. (This defaults to no projects, if none are specified.)
- To select the dbt projects and environments you want to include in crawling, click Include Metadata. (This defaults to all projects, if none are specified.)
- To limit the enrichment to a particular connection with materialized assets, click Connection and select the relevant option. (This defaults to all connections, if none are specified.)
- To import existing tags from dbt to Atlan, for Import Tags, click Yes.
- For Advanced options, click Yes to configure the crawler further:
  - For Enrich Metadata in Materialized Assets, click Yes to enable enrichment for both dbt and materialized assets or No for dbt assets only.
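The include and exclude rules described in this section (exclude takes precedence, an empty include filter means all projects, an empty exclude filter means none) can be sketched as follows. This is an illustration of the stated rules only, not Atlan's implementation; `is_crawled` and the project names are hypothetical.

```python
from typing import Iterable, Optional


def is_crawled(
    project: str,
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> bool:
    """Decide whether a project is crawled under the documented filter rules."""
    include_set = set(include or [])
    exclude_set = set(exclude or [])
    # The exclude filter wins when a project appears in both filters.
    if project in exclude_set:
        return False
    # An empty include filter defaults to all projects.
    return not include_set or project in include_set


if __name__ == "__main__":
    # "sales" and "finance" are made-up project names.
    print(is_crawled("sales", include=["sales"], exclude=["sales"]))
    print(is_crawled("finance"))
```

Checking the exclude set first is what gives it precedence: a project listed in both filters is never crawled.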
Run the crawler
To run the dbt crawler, after completing the previous steps:
- To check for any permissions or other configuration issues before running the crawler, click Preflight checks.
- You can either:
  - To run the crawler once immediately, at the bottom of the screen, click the Run button.
  - To schedule the crawler to run hourly, daily, weekly, or monthly, at the bottom of the screen, click the Schedule Run button.
Once the crawler has completed running, you can see the assets on Atlan's asset page! 🎉