
Set up dbt Core

This guide explains how to set up dbt Core in Atlan, including configuring access, organizing your storage bucket, and uploading the necessary metadata files so Atlan can process and analyze your dbt project data.

Setup and access management

In this section, learn how to configure access for dbt Core so Atlan can connect to your storage location and read the required metadata. Choose between using your own cloud storage bucket or an Atlan-managed bucket.

To begin, go to Marketplace → search for dbt → click to set up dbt → select Object Storage, and then choose your cloud provider. Atlan supports reading from AWS, Azure, and GCP. The setup process prompts for the information each cloud provider requires. For authentication, refer to the following:

Amazon S3

Follow the instructions below to create an IAM role with the required permissions.

Azure ADLS

Follow the instructions below to create a service principal with the required permissions.

Google GCS

Follow the instructions below to create a service account with the required permissions.

Structure the bucket

Once you have configured access, the next step is to organize your storage bucket so that Atlan can correctly identify and process uploaded files.

info

Atlan uses the metadata.invocation_id and metadata.project_id attributes to uniquely identify and link the uploaded files. Atlan doesn't use file paths to identify the project or job that a file belongs to. The following directory structure is provided as a guideline.
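To see which identifiers a given artifact carries, you can inspect its metadata block before uploading. The following is a minimal sketch, not part of Atlan's tooling: the function name read_artifact_ids and the target/manifest.json path are assumptions for illustration, and the code only assumes that the artifact is a JSON file with a top-level metadata object.

```python
import json
from pathlib import Path


def read_artifact_ids(artifact_path):
    """Return (invocation_id, project_id) from a dbt artifact's metadata block.

    Either value is None when the artifact does not carry that key.
    """
    with Path(artifact_path).open(encoding="utf-8") as handle:
        metadata = json.load(handle).get("metadata", {})
    return metadata.get("invocation_id"), metadata.get("project_id")


if __name__ == "__main__":
    # Hypothetical path: dbt's default target directory.
    manifest = Path("target/manifest.json")
    if manifest.exists():
        print(read_artifact_ids(manifest))
```

Files produced by the same dbt invocation should report the same invocation_id, which gives you a quick check that the files in one job folder belong together.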

Atlan supports extracting dbt metadata from a single dbt project or from multiple dbt projects. The main-prefix has the format gcs|s3://<BUCKET_NAME>/<PATH_PREFIX> or abfss://<CONTAINER>/<PATH>. If you use Atlan's bucket, the Atlan support team provides the main-prefix after setting up access policies on the bucket.

You need to use the following directory structure, even if you have a single dbt project:

main-prefix
  - project1
    - job1
      - manifest.json
      - other files
    - job2
      - manifest.json
      - other files
    - job4
      - manifest.json
      - other files
  - project3
    - job5
      - manifest.json
      - other files
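The layout above maps each file to a destination of the form main-prefix/project/job/filename. As a rough sketch, a small helper can build that destination string. The function name artifact_key and the bucket name in the usage example are hypothetical.

```python
def artifact_key(main_prefix, project, job, filename):
    """Join the main-prefix, project folder, job folder, and file name."""
    parts = [
        main_prefix.rstrip("/"),  # avoid a doubled slash after the prefix
        project.strip("/"),
        job.strip("/"),
        filename,
    ]
    return "/".join(parts)


# Hypothetical bucket and path prefix, for illustration only.
print(artifact_key("s3://my-bucket/dbt", "project1", "job1", "manifest.json"))
# prints "s3://my-bucket/dbt/project1/job1/manifest.json"
```

Keeping one folder per job means that artifacts from different runs never overwrite each other, even though every run writes files with the same names.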

Upload project files

To load the correct metadata, Atlan processes the manifest.json and run_results.json files for each job. There are many ways to load the metadata. The approaches below are the ones Atlan suggests. Upload the files from the target directory of the dbt project into distinct folders, one folder per job. Upload the run artifacts generated by the following commands:

  • (Required) Compilation results:
dbt compile --full-refresh

This command generates files that contain a full representation of your dbt project's resources, including models, tests, macros, node configurations, resource properties, and more.

Files to upload: manifest.json and run_results.json

Alternatively, you can generate the same files by running the dbt run --full-refresh command.

  • (Optional) Test results:
dbt test

This command executes all dbt tests in a dbt project and generates files that contain the test results.

Files to upload: manifest.json and run_results.json

  • (Optional) Catalog:
dbt docs generate

This command generates metadata about the tables and views produced by the models in your dbt project, for example, column data types and table statistics.

Files to upload: manifest.json and catalog.json
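Because each dbt command rewrites the files in the target directory, it can help to copy the artifacts into a job folder right after the command finishes. The following sketch stages files locally in the suggested layout. It is one possible approach, not an Atlan utility: the function name stage_artifacts and the folder names in the usage example are assumptions, and the actual upload to your bucket is left to your cloud provider's tooling.

```python
import shutil
from pathlib import Path


def stage_artifacts(target_dir, job_dir,
                    filenames=("manifest.json", "run_results.json")):
    """Copy the named dbt artifacts from target_dir into job_dir.

    Raises FileNotFoundError if any requested file is missing, so an
    incomplete job folder is never produced silently.
    """
    target, dest = Path(target_dir), Path(job_dir)
    missing = [name for name in filenames if not (target / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing artifacts in {target}: {missing}")
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in filenames:
        shutil.copy2(target / name, dest / name)
        copied.append(name)
    return copied


if __name__ == "__main__":
    # Hypothetical folder names; run after `dbt docs generate`.
    if Path("target/manifest.json").is_file() and Path("target/catalog.json").is_file():
        stage_artifacts("target", "staging/project1/job1",
                        filenames=("manifest.json", "catalog.json"))
```

The default file list matches the compile and test steps. For the catalog step, pass manifest.json and catalog.json instead, as in the usage example.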