Set up cross-workspace extraction
Eliminate the need for separate crawler configurations by using a single service principal to crawl metadata from all workspaces within a Databricks metastore. This guide walks you through configuring the necessary permissions to enable cross-workspace extraction.
Cross-workspace extraction isn't supported for REST API or JDBC extraction methods.
Prerequisites
Before you begin, make sure you have:
- A Unity Catalog-enabled Databricks workspace
- Account admin access to create and manage service principals
- Workspace admin access to grant permissions across all target workspaces
- At least one active SQL warehouse in each workspace you intend to crawl
- Set up Databricks authentication completed with one of the supported authentication methods
- System table extraction enabled for lineage and usage extraction
Permissions required
The service principal needs the following permissions to enable cross-workspace extraction:
CAN_USE
on SQL warehouses in each workspaceSELECT
onsystem.access.workspace_latest
tableUSE CATALOG
,BROWSE
, andSELECT
on all catalogs you want to crawl
Add service principal to all workspaces
You must use a single, common service principal that has been granted access to all Databricks workspaces you intend to crawl within the metastore.
- Log in to your Databricks account console as an account admin
- From the left menu, click Workspaces and select a workspace
- From the tabs along the top, click the Permissions tab
- In the upper right, click Add permissions
- In the Add permissions dialog:
- For User, group, or service principal, select your service principal
- For Permission, select workspace User
- Click Add
- Repeat steps 2-5 for each workspace you intend to crawl
Grant permissions
Configure the necessary permissions for the service principal to access and extract metadata from all workspaces within the metastore.
-
SQL workspace permissions: The service principal must have usage permissions on at least one active SQL warehouse within each workspace. The extractor uses the smallest available warehouse to run its discovery queries.
- Via SQL
- Via UI
-
Connect to your Databricks workspace using a SQL client or the SQL editor
-
Run the following command for each workspace, replacing the placeholders:
GRANT CAN_USE ON WAREHOUSE <warehouse_name> TO `<service_principal_id>`;
- Replace
<warehouse_name>
with your actual warehouse name - Replace
<service_principal_id>
with your service principal's application ID
Example
GRANT CAN_USE ON WAREHOUSE production-warehouse TO `12345678-1234-1234-1234-123456789012`;
- Replace
- Log in to your Databricks workspace as a workspace admin
- From the left menu, click SQL Warehouses
- On the Compute page, for each SQL warehouse, click the 3-dot icon and then click Permissions
- In the Manage permissions dialog:
- In the Type to add multiple users or groups field, search for and select your service principal
- Select Can use permission
- Click Add to assign the permission
-
System table permissions: Access to the system schema is essential for workspace and lineage discovery.
- Via SQL
- Via UI
-
Connect to your Databricks workspace using a SQL client or the SQL editor
-
Run the following command, replacing the placeholder:
GRANT SELECT ON TABLE system.access.workspace_latest TO `<service_principal_id>`;
- Replace
<service_principal_id>
with your service principal's application ID
Example
GRANT SELECT ON TABLE system.access.workspace_latest TO `12345678-1234-1234-1234-123456789012`;
- Replace
- Log in to your Databricks workspace as a workspace admin
- From the left menu, click Catalog
- In the Catalog Explorer, navigate to system > access
- Click on the workspace_latest table
- Click the Permissions tab and then click Grant
- In the Grant permissions dialog:
- Under Principals, select your service principal
- Under Privileges, check SELECT
- Click Grant to apply the permissions
-
Asset permissions: The service principal requires permissions to "see" and "read" the metadata for all data assets you wish to extract. These grants must be applied to all private, public, and shared catalogs that are in scope for crawling.
Important!For private catalogs, grant permissions from each workspace. For public catalogs, grant from any workspace.
- Via SQL
- Via UI
-
Connect to your Databricks workspace using a SQL client or the SQL editor
-
Grant catalog-level permissions (required even when using BROWSE - BROWSE automatically grants access to all schemas and tables):
GRANT USE CATALOG ON CATALOG <catalog_name> TO `<service_principal_id>`;
GRANT BROWSE ON CATALOG <catalog_name> TO `<service_principal_id>`;- Replace
<catalog_name>
with your actual catalog name - Replace
<service_principal_id>
with your service principal's application ID
- Replace
-
If not using BROWSE, along with catalog permissions, grant additional permissions:
-
Grant schema-level permissions:
GRANT USE SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<service_principal_id>`;
- Replace
<catalog_name>
and<schema_name>
with your actual values - Replace
<service_principal_id>
with your service principal's application ID
- Replace
-
Grant table-level permissions:
GRANT SELECT ON ALL TABLES IN SCHEMA <catalog_name>.<schema_name> TO `<service_principal_id>`;
- Replace
<catalog_name>
and<schema_name>
with your actual values - Replace
<service_principal_id>
with your service principal's application ID
- Replace
-
Example
GRANT USE CATALOG ON CATALOG main TO `12345678-1234-1234-1234-123456789012`;
GRANT BROWSE ON CATALOG main TO `12345678-1234-1234-1234-123456789012`;- Log in to your Databricks workspace as a workspace admin
- From the left menu, click Catalog
- In the Catalog Explorer, navigate to the catalog you want to grant permissions on (for example,
main
) - Click the Permissions tab and then click Grant
- In the Grant permissions dialog:
- Under Principals, select your service principal
- Under Privileges, check the following permissions:
- USE CATALOG
- USE SCHEMA
- BROWSE
- SELECT
- Click Grant to apply the permissions
- Repeat steps 3-5 for each catalog you want to crawl in Atlan
Need help?
- Check Troubleshooting Databricks connectivity for common issues
- Contact Atlan support for help with setup or integration