Set up Iceberg
Configure your Iceberg catalog to enable Atlan to connect and crawl your data lakehouse assets.
Atlan supports two setup modes for Iceberg:
- Generic REST Catalog using OAuth2 client credentials
- BigLake Metastore (BLM) on Google Cloud using either service account key auth or Workload Identity Federation (WIF)
Prerequisites
Before you begin, make sure you have:
- Permission to create and assign IAM roles in your environment
- Network connectivity from Atlan (or Self-Deployed Runtime) to your catalog endpoint
Choose setup mode
- Generic REST Catalog
- BigLake Metastore (GCP)
Use this mode for REST catalogs that support OAuth2 client credentials.
- Request REST catalog credentials from your catalog administrator.
- Gather the following values:
- REST Catalog URI (for example, https://your-catalog.com/api/rest)
- Client ID
- Client Secret
- Catalog Name
- Warehouse
- Scope (if required by your catalog)
- When creating the crawler in Atlan, select Authentication method = Token and enter credentials in the format client-id:client-secret.
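As a rough sketch of the two pieces above: the first helper builds the `client-id:client-secret` string Atlan expects for the Token authentication method, and the second builds the form-encoded body of an OAuth2 client-credentials token request of the kind such REST catalogs accept. The function names are illustrative, not part of any Atlan or Iceberg API.

```python
from urllib.parse import urlencode


def atlan_token_credential(client_id: str, client_secret: str) -> str:
    """Credential string for Atlan's Authentication method = Token."""
    return f"{client_id}:{client_secret}"


def oauth2_token_request(client_id: str, client_secret: str, scope: str = "") -> str:
    """Form-encoded body for an OAuth2 client-credentials token request.

    Include scope only if your catalog requires one.
    """
    fields = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:
        fields["scope"] = scope
    return urlencode(fields)
```

The exact token endpoint path and any required scope vary by catalog vendor; confirm both with your catalog administrator.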
Use this mode when your Iceberg REST catalog is backed by Google BigLake Metastore.
Required BigLake permissions
Create a custom IAM role with the following permissions, then assign it to the service account used by Atlan:
- biglake.catalogs.get: Retrieves catalog metadata. This is metadata access only and doesn't grant table data access.
- biglake.databases.get: Retrieves namespace/database metadata. This is metadata access only and doesn't grant table data access.
- biglake.databases.list: Lists namespaces/databases in the catalog for discovery.
- biglake.tables.get: Retrieves table metadata. This is metadata access only and doesn't grant table data access.
- biglake.tables.list: Lists tables in a namespace for discovery.
- biglake.catalogs.use: Enables use of the catalog resource during metadata API calls.
- biglake.databases.use: Enables use of namespace/database resources during metadata API calls.
- biglake.tables.readMetadata: Reads table metadata details (schema, partitions, snapshots) without reading table data.
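The eight permissions above can be assembled into the comma-separated value that `gcloud iam roles create --permissions=...` expects; this small sketch only builds that flag value, it doesn't call any Google Cloud API.

```python
# The BigLake metadata-only permissions required by the Atlan crawler.
BIGLAKE_PERMISSIONS = [
    "biglake.catalogs.get",
    "biglake.databases.get",
    "biglake.databases.list",
    "biglake.tables.get",
    "biglake.tables.list",
    "biglake.catalogs.use",
    "biglake.databases.use",
    "biglake.tables.readMetadata",
]


def permissions_flag(perms: list = BIGLAKE_PERMISSIONS) -> str:
    """Comma-separated value for gcloud's --permissions flag."""
    return ",".join(perms)
```

You would then pass the result to something like `gcloud iam roles create atlanBigLakeReader --project=YOUR_PROJECT --permissions=<flag value>`, where the role name here is only an example.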
Authentication setup for BLM
A Google Cloud service account is required for both authentication options below.
Create service account
- Create (or reuse) a service account in your Google Cloud project.
- Assign the custom BigLake role to this service account.
- Keep the service account email ready for crawler configuration.
Choose authentication mode
- Service account key
- Workload Identity Federation (WIF)
Use this option when you want key-based authentication.
- Create and securely store a JSON key for the service account.
- Use these values when configuring the crawler:
- Project ID
- Location
- Catalog Name
- Warehouse (for example, gs://your-bucket/warehouse)
- Service account JSON key
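Before running the crawler, it can help to sanity-check that all of the values above are present and that the warehouse is a Cloud Storage URI. The field names in this sketch are illustrative, not Atlan's exact form labels.

```python
def validate_blm_key_config(cfg: dict) -> list:
    """Return a list of problems with a service-account-key crawler config."""
    required = ("project_id", "location", "catalog_name",
                "warehouse", "service_account_key")
    problems = [f"missing {field}" for field in required if not cfg.get(field)]
    warehouse = cfg.get("warehouse", "")
    if warehouse and not warehouse.startswith("gs://"):
        problems.append("warehouse should be a gs:// URI, "
                        "e.g. gs://your-bucket/warehouse")
    return problems
```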
Use this option to avoid long-lived service account keys.
- Create an OAuth client in Atlan and securely store:
- OAuth Client ID
- OAuth Client Secret
- In Google Cloud, create a Workload Identity Pool and OIDC provider that trusts your Atlan tenant issuer.
- Configure attribute mapping for audience and add your Atlan OAuth client ID as the audience.
- Grant roles/iam.workloadIdentityUser on the target service account to the workload identity principal set.
- Copy the WIF provider resource name in this format:
//iam.googleapis.com/projects/<project-number>/locations/global/workloadIdentityPools/<pool-id>/providers/<provider-id>
- Use these values when configuring the crawler:
- Project ID
- Location
- Catalog Name
- Warehouse
- Service Account Email
- WIF Pool Provider ID
- Atlan OAuth Client ID
- Atlan OAuth Client Secret
For detailed WIF setup flow, refer to Set up Workload Identity Federation for Google BigQuery. The same Atlan OAuth and Google WIF concepts apply.
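The WIF provider resource name follows the fixed template shown above; this helper just fills in the three variable parts, so you can assemble the value once you know your project number, pool ID, and provider ID. The function name is illustrative.

```python
def wif_provider_resource(project_number: str, pool_id: str, provider_id: str) -> str:
    """Full resource name of a Workload Identity Federation OIDC provider."""
    return (
        "//iam.googleapis.com/projects/{}/locations/global/"
        "workloadIdentityPools/{}/providers/{}"
    ).format(project_number, pool_id, provider_id)
```

Note that the template takes the numeric project number, not the project ID.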
Verify network connectivity
Before crawling, confirm Atlan can reach your Iceberg catalog:
- HTTPS access: Your REST catalog endpoint must be available via HTTPS.
- Firewall rules: Permit outbound connections from Atlan (or Self-Deployed Runtime) to your catalog endpoint.
- DNS resolution: Your catalog hostname must be resolvable from the runtime.
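The three checks above can be approximated locally before configuring the crawler. This sketch validates that the endpoint is HTTPS and, optionally, that its hostname resolves from the runtime; it is a preflight helper of our own, not an Atlan tool.

```python
import socket
from urllib.parse import urlsplit


def check_catalog_endpoint(uri: str, resolve_dns: bool = False) -> tuple:
    """Validate a catalog URI; return (host, port).

    Raises ValueError for a malformed or non-HTTPS URI, and
    socket.gaierror if resolve_dns is set and the hostname doesn't resolve.
    """
    parts = urlsplit(uri)
    if parts.scheme != "https":
        raise ValueError(f"catalog endpoint must use HTTPS, got {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("catalog endpoint has no hostname")
    if resolve_dns:
        socket.gethostbyname(parts.hostname)
    return parts.hostname, parts.port or 443
```

Run it with `resolve_dns=True` from the host where the crawler (or Self-Deployed Runtime) executes, since DNS and firewall rules can differ from your workstation's.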
Next steps
- Crawl Iceberg assets: Configure and run the crawler to extract metadata from Iceberg.