Set up dbt Core

This guide explains how to set up dbt Core in Atlan, including configuring access, organizing your storage bucket, and uploading the necessary metadata files so Atlan can process and analyze your dbt project data.

Setup and access management

In this section, learn how to configure access for dbt Core so Atlan can connect to your storage location and read the required metadata. Choose between using your own cloud storage bucket or an Atlan-managed bucket.

Use this option if you store dbt artifacts in your own cloud storage bucket. You create a dedicated read credential for Atlan, then configure the connector in Atlan with your bucket details.

Step 1: Obtain Atlan's dbt service identity ARN

Contact Atlan support to request the Atlan dbt service identity ARN. You need this value to configure the trust relationship in Step 3.

Step 2: Create IAM policy

  1. In your AWS account, go to IAM → Policies → Create policy.
  2. Select the JSON tab and paste the following, replacing <your-bucket> and <your-prefix> with your actual values:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AtlanDbtReadAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<your-bucket>",
        "arn:aws:s3:::<your-bucket>/<your-prefix>/*"
      ]
    }
  ]
}
  3. Name the policy (for example, AtlanDbtCoreReadPolicy) and create it.
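
To reduce copy-paste mistakes when filling in the placeholders, the policy from this step can also be generated programmatically. The following is a minimal sketch using only the Python standard library. The function name `build_read_policy` and the example bucket and prefix are hypothetical and not part of any Atlan or AWS tooling, and the handling of an empty prefix is an assumption.

```python
import json


def build_read_policy(bucket: str, prefix: str = "") -> dict:
    """Return the read-only policy from Step 2 with the placeholders filled in."""
    prefix = prefix.strip("/")
    # Assumption: with no prefix, read access covers every object in the bucket.
    objects = f"{bucket}/{prefix}/*" if prefix else f"{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AtlanDbtReadAccess",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{objects}",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values; substitute your own bucket and prefix.
    print(json.dumps(build_read_policy("my-dbt-artifacts", "dbt/prod"), indent=2))
```

The printed JSON can be pasted into the JSON tab of the policy editor.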

Step 3: Create IAM role with trust policy

  1. In AWS, go to IAM → Roles → Create role.
  2. Select Trusted entity type: AWS account → Another AWS account and enter the account ID from the Atlan dbt service identity ARN.
  3. Attach the policy you created in Step 2.
  4. Name the role (for example, AtlanDbtCoreRole) and create it.
  5. Open the new role and click Edit trust policy. Replace the policy with:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "<atlan-dbt-service-identity-arn>"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<your-external-id>"
        }
      }
    }
  ]
}

Replace <atlan-dbt-service-identity-arn> with the ARN from Step 1, and <your-external-id> with a unique string of your choice (for example, atlan-dbt-external-id). Make a note of this external ID; you enter it in Atlan in the next step.

  6. Copy the Role ARN from the role summary page (format: arn:aws:iam::123456789012:role/AtlanDbtCoreRole).
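
Step 3 needs two values derived from the service identity ARN: the account ID for the trusted-entity form and the completed trust policy. The sketch below shows one way to derive both with the Python standard library. The helper names `account_id_from_arn` and `build_trust_policy` are hypothetical, and the ARN in the example is the placeholder format shown above, not a real Atlan identity.

```python
import json


def account_id_from_arn(arn: str) -> str:
    """Return the 12-digit account ID, the fifth colon-separated field of an IAM ARN."""
    parts = arn.split(":")
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    account_id = parts[4]
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError(f"no 12-digit account ID in: {arn!r}")
    return account_id


def build_trust_policy(principal_arn: str, external_id: str) -> dict:
    """Return the trust policy from Step 3 with the placeholders filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN in the documented format; use the ARN from Step 1 instead.
    arn = "arn:aws:iam::123456789012:role/AtlanDbtCoreRole"
    print(account_id_from_arn(arn))
    print(json.dumps(build_trust_policy(arn, "atlan-dbt-external-id"), indent=2))
```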

Step 4: Configure connector in Atlan

  1. Go to Marketplace → search for dbt → click to set up dbt.
  2. For Source, select Core.
  3. For Manifest Source, select External Object Storage.
  4. For Storage Provider, select AWS. Set Authentication to IAM Role.
  5. Enter:
    • AWS Role ARN: the Role ARN from Step 3
    • Bucket Name: your S3 bucket name
    • Prefix: the path within the bucket where dbt artifacts are stored
    • Region: your bucket's AWS region
  6. Click Test Authentication to verify, then proceed to configure the crawler.

Structure the bucket

Once you have configured access, the next step is to organize your storage bucket so that Atlan can correctly identify and process uploaded files.

info

Atlan uses the metadata.invocation_id and metadata.project_id attributes to uniquely identify and link the uploaded files. Atlan doesn't use file paths to identify the project or job that a file belongs to. The following directory structure is provided as a guideline.
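
Because these attributes, not the file paths, tie the files together, it can help to inspect them before uploading. The following is a minimal sketch that reads both values from a manifest.json file. The function name `manifest_identity` is hypothetical, and the sketch only assumes that the values sit under the top-level `metadata` key, as the attribute names above indicate.

```python
import json
from pathlib import Path


def manifest_identity(manifest_path):
    """Return (project_id, invocation_id) from a manifest.json file.

    Either value is None if the attribute is missing from the file.
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    metadata = manifest.get("metadata", {})
    return metadata.get("project_id"), metadata.get("invocation_id")


if __name__ == "__main__":
    # Assumes the default dbt output directory, relative to the project root.
    project_id, invocation_id = manifest_identity("target/manifest.json")
    print(f"project_id={project_id} invocation_id={invocation_id}")
```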

Atlan supports extracting dbt metadata from one or more dbt projects. The main-prefix has the format gcs|s3://<BUCKET_NAME>/<PATH_PREFIX> or abfss://<CONTAINER>/<PATH>. If you use Atlan's bucket, the Atlan support team provides the main-prefix after setting up access policies on your bucket.

note

The <PATH_PREFIX> (or <PATH> for Azure) is optional. If your dbt project directories live at the bucket or container root, leave the Prefix field empty when you configure the crawler and place your project folders directly under the bucket or container.

You need to use the following directory structure, even if you have a single dbt project:

main-prefix
- project1
  - job1
    - manifest.json
    - other files
  - job2
    - manifest.json
    - other files
  - job4
    - manifest.json
    - other files
- project3
  - job5
    - manifest.json
    - other files
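
The layout above maps each file to an object key of the form prefix/project/job/file, where the prefix may be empty. The following is a small sketch of that mapping. The function name `artifact_key` and the example values are hypothetical.

```python
def artifact_key(project, job, filename, prefix=""):
    """Build the object key for one artifact under the guideline layout.

    An empty prefix places the project folder at the bucket or container root.
    """
    parts = [part.strip("/") for part in (prefix, project, job, filename)]
    return "/".join(part for part in parts if part)


if __name__ == "__main__":
    print(artifact_key("project1", "job1", "manifest.json", prefix="dbt/prod"))
    # -> dbt/prod/project1/job1/manifest.json
    print(artifact_key("project1", "job1", "manifest.json"))
    # -> project1/job1/manifest.json
```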

Upload project files

To load correct metadata, Atlan processes the manifest.json and run_results.json files for each job. There are many ways to load the metadata; the approaches below are the ones Atlan suggests. Upload the files from the target directory of the dbt project into distinct folders. Upload the run artifacts generated by the following commands:

  • (Required) Compilation results:
dbt compile --full-refresh

This command generates files that contain a full representation of your dbt project's resources, including models, tests, macros, node configurations, resource properties, and more.

Files to upload: manifest.json and run_results.json

Alternatively, you can generate the same files by running the dbt run --full-refresh command.

  • (Optional) Test results:
dbt test

This command executes all dbt tests in a dbt project and generates files that contain the test results.

Files to upload: manifest.json and run_results.json

  • (Optional) Catalog:
dbt docs generate

This command generates metadata about the tables and views produced by the models in your dbt project, for example, column data types and table statistics.

Files to upload: manifest.json and catalog.json
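
Before uploading, the files for each command can be copied from the target directory into a local folder that mirrors the bucket layout. The following is a minimal sketch of that staging step. The `ARTIFACTS` mapping restates the file lists above, while the function name `stage_artifacts` and the folder names in the usage note are hypothetical. The upload itself depends on your storage provider and tooling and isn't shown.

```python
import shutil
from pathlib import Path

# Files this guide lists for each dbt command.
ARTIFACTS = {
    "compile": ["manifest.json", "run_results.json"],
    "test": ["manifest.json", "run_results.json"],
    "docs generate": ["manifest.json", "catalog.json"],
}


def stage_artifacts(target_dir, staging_dir, command):
    """Copy the artifacts for one dbt command into staging_dir.

    Raises FileNotFoundError if an expected file is missing from target_dir.
    Returns the list of copied file paths.
    """
    target, staging = Path(target_dir), Path(staging_dir)
    names = ARTIFACTS[command]
    missing = [name for name in names if not (target / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {target}: {', '.join(missing)}")
    staging.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(target / name, staging / name)) for name in names]
```

For example, after running dbt compile --full-refresh, calling `stage_artifacts("target", "staging/project1/job1", "compile")` would copy manifest.json and run_results.json into a folder ready to upload under the main-prefix.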