Cross-workspace extraction

This FAQ answers common questions about setting up and configuring cross-workspace extraction for Databricks. Cross-workspace extraction lets you use a single service principal to crawl metadata from all workspaces within a Databricks metastore, so you don't need a separate connection for each workspace.

Why do you need cross-workspace extraction?

If you have multiple workspaces under the same metastore in your Databricks environment, this feature removes the need to set up a separate Databricks connection for each workspace. Instead, a single connection can extract metadata from all available workspaces in the metastore.

What are public and private catalogs?

  • Public catalogs are available from all workspaces within a metastore.
  • Private catalogs are restricted to specific workspaces and aren't available across the entire metastore.
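
As a rough illustration of the distinction, the sketch below models catalog visibility in plain Python. The Catalog class, its bound_workspaces field, and the workspace and catalog names are hypothetical and aren't part of any Databricks or crawler API. A private catalog is modeled here as one with an explicit list of workspaces.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    name: str
    # Hypothetical field: an empty set means the catalog is public,
    # that is, available from every workspace in the metastore.
    bound_workspaces: set[str] = field(default_factory=set)

    @property
    def is_public(self) -> bool:
        return not self.bound_workspaces


def visible_catalogs(catalogs: list[Catalog], workspace: str) -> list[str]:
    """Return the names of catalogs available from the given workspace."""
    return [
        c.name
        for c in catalogs
        if c.is_public or workspace in c.bound_workspaces
    ]


# Hypothetical example data.
catalogs = [
    Catalog("sales"),                                  # public
    Catalog("finance", bound_workspaces={"ws-prod"}),  # private to ws-prod
]

print(visible_catalogs(catalogs, "ws-dev"))
print(visible_catalogs(catalogs, "ws-prod"))
```

In this model, a workspace sees every public catalog plus any private catalog restricted to it.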

If you have workspaces in different metastores, can one cross-workspace setup handle all of them, or do you need separate configurations?

One cross-workspace setup extracts metadata only from the workspaces within a single metastore, so workspaces in different metastores need separate configurations. The metastore used for extraction is the one that contains the workspace you originally configured when setting up the Databricks crawler.
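
To estimate how many setups an environment needs, you can group workspaces by metastore. The following is a minimal sketch in plain Python. The setups_needed function and the workspace and metastore names are hypothetical, and the mapping is assumed to come from your own inventory rather than from any specific API.

```python
from collections import defaultdict


def setups_needed(workspace_to_metastore: dict[str, str]) -> dict[str, list[str]]:
    """Group workspaces by metastore.

    Each resulting group corresponds to one cross-workspace setup,
    because a single setup covers only one metastore.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for workspace, metastore in workspace_to_metastore.items():
        groups[metastore].append(workspace)
    return dict(groups)


# Hypothetical inventory of workspaces and the metastore each belongs to.
mapping = {
    "ws-a": "metastore-us",
    "ws-b": "metastore-us",
    "ws-c": "metastore-eu",
}

groups = setups_needed(mapping)
for metastore, workspaces in groups.items():
    print(metastore, workspaces)
```

With this example inventory, two metastores appear, so two separate cross-workspace setups would be required.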

What happens if you add new workspaces to your metastore? Are they automatically included in the extraction?

Yes, provided that the common service principal has the necessary permissions on the newly added workspace.
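
The condition above can be sketched as a simple filter: a workspace is picked up only if it belongs to the metastore and the service principal has permissions on it. The function and workspace names below are hypothetical, and the set of permitted workspaces is assumed to be known in advance rather than looked up through any API.

```python
def extracted_workspaces(
    metastore_workspaces: list[str],
    permitted_workspaces: set[str],
) -> list[str]:
    """Return the workspaces in the metastore that the service principal can access."""
    return [ws for ws in metastore_workspaces if ws in permitted_workspaces]


# Hypothetical state after adding "ws-new" to the metastore.
metastore_workspaces = ["ws-a", "ws-b", "ws-new"]

# Before permissions are granted on the new workspace, it is skipped.
before = extracted_workspaces(metastore_workspaces, {"ws-a", "ws-b"})

# After permissions are granted, it is included without any new connection.
after = extracted_workspaces(metastore_workspaces, {"ws-a", "ws-b", "ws-new"})

print(before)
print(after)
```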