Crawl on-premises Databricks
Offline extraction sunset
The Docker-based databricks-extractor offline tool has been sunset and is no longer available. The sudo docker-compose up runbook and the Offline extraction method on the Databricks crawler are no longer supported. For on-premises or network-restricted Databricks environments, use one of the supported approaches below.
Supported approaches
If you can't expose your Databricks instance to Atlan directly, crawl its metadata using one of the following approaches:
- Agent extraction with Self-Deployed Runtime - Atlan's agent executes metadata extraction within your own environment. See Self-Deployed Runtime and the Agent extraction method section of the Databricks crawler guide.
- Secure Agent - Fetch metadata from Databricks through Atlan's Secure Agent, running inside your network. See How to configure Secure Agent for workflow execution.
- Direct connectivity via private link - Expose Databricks to Atlan over a private link.
Once connectivity is in place, follow the standard Crawl Databricks guide.
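For the agent-based approaches, one way to sanity-check connectivity before running the crawler is to confirm that the host running the agent can open a TCP connection to your Databricks workspace on the HTTPS port. The following is a minimal sketch, not part of Atlan's tooling: the function name can_reach and the workspace hostname are hypothetical placeholders, and a successful connection only shows that the port is reachable, not that credentials or permissions are valid.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it checks network reachability only.
    """
    try:
        # create_connection resolves the hostname and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Placeholder hostname (assumption): replace with your own workspace host.
    workspace_host = "example.cloud.databricks.com"
    if can_reach(workspace_host):
        print(f"{workspace_host}:443 is reachable")
    else:
        print(f"{workspace_host}:443 is NOT reachable from this host")
```

Run the script on the machine where the agent is deployed. If it reports that the host is not reachable, review firewall rules, DNS resolution, and proxy settings before configuring the crawler.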
If you have an existing setup that relies on the sunset offline extractor, contact your Atlan Account team to plan your migration.