Crawl Iceberg

Configure and run the crawler to extract metadata from your Iceberg data lakehouse assets.

Atlan crawls metadata from Iceberg catalogs, namespaces, tables, and columns.

Prerequisites

Before you begin, make sure you have:

Completed the steps in Set up Iceberg
Admin access to your Atlan instance
The connection values for your chosen mode (Generic REST Catalog or BigLake Metastore)

Create crawler workflow

To crawl metadata from Iceberg, review the order of operations and then complete the following steps.

In the top right of any screen, navigate to New and then click New Workflow.
From the list of packages, select Iceberg Assets and click Setup Workflow.

Configure authentication

Choose one authentication mode, Generic REST Catalog or BigLake Metastore (GCP), and then configure either Direct extraction or Agent extraction for that mode.

Generic REST Catalog

Use this mode for REST catalogs that support OAuth2 client credentials.

Direct extraction

Extraction method: Select Direct.
Authentication method: Select Token.
Enter the required values:
- REST Catalog URI: For example, https://your-catalog.com/api/rest
- Token: Enter credentials in the format client-id:client-secret
- Catalog Name
- Warehouse
- Scope (if required by your catalog)
Click Test Connection.
Once successful, click Next.
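
The values above can be checked locally before you enter them. The following is a minimal sketch, not Atlan's implementation: it splits a token of the form client-id:client-secret and builds the form body of an OAuth2 client-credentials request. The `/v1/oauth/tokens` path is an assumption taken from the Iceberg REST catalog specification, and the function names, client values, and the scope value `catalog` are hypothetical. Your catalog may use a different token endpoint.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode, urlparse


def split_token(token: str) -> Tuple[str, str]:
    """Split 'client-id:client-secret' at the first colon."""
    client_id, sep, client_secret = token.partition(":")
    if not sep or not client_id or not client_secret:
        raise ValueError("Token must have the form client-id:client-secret")
    return client_id, client_secret


def build_token_request(catalog_uri: str, token: str,
                        scope: Optional[str] = None) -> Tuple[str, str]:
    """Return (url, form_body) for a client-credentials token request.

    Assumption: the token endpoint lives at <catalog_uri>/v1/oauth/tokens.
    Nothing is sent over the network; this only prepares the request.
    """
    parsed = urlparse(catalog_uri)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("REST Catalog URI must be an absolute http(s) URL")
    client_id, client_secret = split_token(token)
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:  # Scope is only needed if the catalog requires it
        form["scope"] = scope
    return catalog_uri.rstrip("/") + "/v1/oauth/tokens", urlencode(form)


# Hypothetical values for illustration only.
url, body = build_token_request(
    "https://your-catalog.com/api/rest", "my-client:my-secret", scope="catalog"
)
print(url)
print(body)
```

A malformed token (for example, one without a colon) raises `ValueError` here, which is the kind of input problem that would otherwise only surface when you click Test Connection.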

Agent extraction

Extraction method: Select Agent.
Provide the same values as for Direct extraction, but supply them through your configured secret store.
Complete runtime configuration by following How to configure Secure Agent for workflow execution.
Click Next.

BigLake Metastore (GCP)

Use this mode for Iceberg catalogs backed by Google BigLake Metastore.

Choose a GCP authentication type, Service account key or Workload Identity Federation (WIF), and follow the matching steps.

Service account key

Direct extraction

Extraction method: Select Direct.
Authentication method: Select BigLake Metastore (BLM).
GCP authentication type: Select Service account key.
Enter the required values:
- REST Catalog URI
- Project ID
- Location
- Catalog Name
- Warehouse (use your configured warehouse path, for example, gs://<bucket>/warehouse)
- Service account JSON key
Click Test Connection.
Once successful, click Next.
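
Before pasting the service account JSON key, it can help to confirm that the file is well formed. The sketch below is an illustrative pre-check, not part of Atlan: it looks for fields that a standard Google Cloud service account key file contains and, as an assumption based on the example path above, flags warehouse values that do not start with `gs://`. The function names and sample values are hypothetical.

```python
import json
from typing import List

# Fields found in a standard Google Cloud service account key file.
REQUIRED_KEY_FIELDS = (
    "type", "project_id", "private_key_id", "private_key", "client_email",
)


def check_service_account_key(raw_json: str) -> List[str]:
    """Return a list of problems; an empty list means the key looks well formed."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["key must be a JSON object"]
    problems = [
        f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not key.get(name)
    ]
    if key.get("type") not in (None, "service_account"):
        problems.append("type must be 'service_account'")
    return problems


def check_warehouse(path: str) -> List[str]:
    """Assumption: the warehouse is a Cloud Storage path such as gs://<bucket>/warehouse."""
    return [] if path.startswith("gs://") else ["warehouse does not start with gs://"]


# Hypothetical key content for illustration; never commit a real key.
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key_id": "abc123",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...",
    "client_email": "crawler@my-project.iam.gserviceaccount.com",
})
print(check_service_account_key(sample))
print(check_warehouse("gs://my-bucket/warehouse"))
```

This only checks the shape of the inputs. Whether the service account has the required permissions is still determined by Test Connection.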

Agent extraction

Extraction method: Select Agent.
Select BigLake Metastore (BLM) and Service account key.
Provide the same values as for Direct extraction, but supply them through your configured secret store.
Complete runtime configuration by following How to configure Secure Agent for workflow execution.
Click Next.

Workload Identity Federation (WIF)

Direct extraction

Extraction method: Select Direct.
Authentication method: Select BigLake Metastore (BLM).
GCP authentication type: Select Workload Identity Federation (WIF).
Enter the required values:
- REST Catalog URI
- Project ID
- Location
- Catalog Name
- Warehouse (use your configured warehouse path, for example, gs://<bucket>/warehouse)
- Service Account Email
- WIF Pool Provider ID
- Atlan OAuth Client ID
- Atlan OAuth Client Secret
Click Test Connection.
Once successful, click Next.
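
The WIF mode has the longest list of required values, so a simple completeness check can save a failed connection test. The following sketch collects the nine values listed above in a dataclass and reports which are empty. The class name, field names, and sample values are hypothetical and only mirror the labels in the form; they are not an Atlan API.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class WifConnection:
    """One attribute per required value in the WIF Direct extraction form."""
    rest_catalog_uri: str = ""
    project_id: str = ""
    location: str = ""
    catalog_name: str = ""
    warehouse: str = ""
    service_account_email: str = ""
    wif_pool_provider_id: str = ""
    atlan_oauth_client_id: str = ""
    atlan_oauth_client_secret: str = ""

    def missing(self) -> List[str]:
        """Return the names of values that are empty or whitespace only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical, partially filled configuration.
conn = WifConnection(
    rest_catalog_uri="https://example.com/iceberg",
    project_id="my-project",
    location="us-central1",
    catalog_name="lakehouse",
    warehouse="gs://my-bucket/warehouse",
)
print(conn.missing())
```

For the configuration above, the four values still to be supplied are the service account email, the WIF pool provider ID, and the Atlan OAuth client ID and secret.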

Agent extraction

Extraction method: Select Agent.
Select BigLake Metastore (BLM) and Workload Identity Federation (WIF).
Provide the same values as for Direct extraction, but supply them through your configured secret store.
Complete runtime configuration by following How to configure Secure Agent for workflow execution.
Click Next.

Configure connection

On this page, define how this Iceberg connection is identified and managed in Atlan.

Provide a Connection Name that represents your source environment (for example, production, development, or iceberg-blm).
To control who can manage this connection, configure Connection Admins.
Click Next.

Configure crawler

Before running the crawler, optionally customize crawl scope on the Metadata page:

Exclude Metadata: Select specific namespaces and tables to skip.
Include Metadata: Select specific namespaces and tables to include.
Preflight checks: Validate connectivity and permissions before execution.
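
To reason about how include and exclude selections combine, the sketch below models one plausible rule set. It rests on assumptions that this page does not state: an empty include selection means everything is in scope, an exclude match always wins over an include match, and a namespace listed with no tables covers all of its tables. Check the behavior in your own workflow before relying on it. All names and sample data are hypothetical.

```python
from typing import Dict, Optional, Set

# Maps a namespace to a set of table names.
# An empty set means "every table in that namespace".
Selection = Dict[str, Set[str]]


def _matches(selection: Selection, namespace: str, table: str) -> bool:
    """True if the selection covers this namespace and table."""
    if namespace not in selection:
        return False
    tables = selection[namespace]
    return not tables or table in tables


def is_in_scope(namespace: str, table: str,
                include: Optional[Selection] = None,
                exclude: Optional[Selection] = None) -> bool:
    """Apply the assumed rules: exclude wins, empty include means all."""
    if exclude and _matches(exclude, namespace, table):
        return False
    if include:
        return _matches(include, namespace, table)
    return True


# Hypothetical selections: crawl all of "sales", but skip one staging table.
include = {"sales": set()}
exclude = {"sales": {"orders_staging"}}

print(is_in_scope("sales", "orders", include, exclude))
print(is_in_scope("sales", "orders_staging", include, exclude))
print(is_in_scope("finance", "ledger", include, exclude))
```

Under these assumptions, only `sales.orders` is crawled: the staging table is excluded explicitly, and `finance.ledger` is out of scope because it is not in the include selection.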

Run crawler

After configuration, choose how to run:

Click Run to run once immediately.
Click Schedule & Run to run on a schedule.

Verify crawled assets

After the crawler completes:

Navigate to Workflows and open the Iceberg workflow run.
Review execution details and logs.
Confirm that the run status is Success.

Then verify crawled assets from Iceberg in Atlan search and asset views.
