
Databricks assets package

The Databricks assets package crawls Databricks assets and publishes them to Atlan for discovery.

Direct extraction

Will create a new connection

Use this method only to create the workflow for the first time. Each run creates a new connection and new assets within that connection, which can lead to duplicate assets if you run the workflow this way multiple times with the same settings.

Instead, when you want to re-crawl assets, re-run the existing workflow (see Re-run existing workflow below).

To crawl assets directly from Databricks:

Coming soon

Extraction method: System tables

Will create a new connection

Use this method only to create the workflow for the first time. Each run creates a new connection and new assets within that connection, which can lead to duplicate assets if you run the workflow this way multiple times with the same settings.

Instead, when you want to re-crawl assets, re-run the existing workflow (see Re-run existing workflow below).

To crawl assets directly from Databricks using the system tables extraction method:

Coming soon

Offline extraction

Will create a new connection

Use this method only to create the workflow for the first time. Each run creates a new connection and new assets within that connection, which can lead to duplicate assets if you run the workflow this way multiple times with the same settings.

Instead, when you want to re-crawl assets, re-run the existing workflow (see Re-run existing workflow below).

To crawl Databricks assets from the S3 bucket:

Coming soon

Re-run existing workflow

To re-run an existing workflow for Databricks assets:

Coming soon
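
The create-once, re-run-thereafter rule from the warnings above can be sketched without reference to any SDK. The following is only an illustration of that decision, not the Atlan SDK API: `WorkflowRegistry`, `create_workflow`, `rerun_workflow`, and `crawl` are hypothetical names, and the registry is an in-memory stand-in for the workflows that already exist in a tenant.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WorkflowRegistry:
    """Hypothetical in-memory stand-in for the workflows in a tenant."""

    # Maps a connection name to the number of times its workflow has run.
    runs: Dict[str, int] = field(default_factory=dict)

    def create_workflow(self, connection_name: str) -> str:
        # Creating a workflow also creates a new connection, so refuse to
        # do it twice for the same name to avoid duplicate assets.
        if connection_name in self.runs:
            raise ValueError(f"workflow for {connection_name!r} already exists")
        self.runs[connection_name] = 1
        return "created"

    def rerun_workflow(self, connection_name: str) -> str:
        # Re-running reuses the existing connection and its assets.
        if connection_name not in self.runs:
            raise KeyError(f"no workflow found for {connection_name!r}")
        self.runs[connection_name] += 1
        return "re-run"


def crawl(registry: WorkflowRegistry, connection_name: str) -> str:
    """Create the workflow on first use and re-run it on every later call."""
    if connection_name in registry.runs:
        return registry.rerun_workflow(connection_name)
    return registry.create_workflow(connection_name)
```

Calling `crawl` twice with the same connection name creates the workflow once and re-runs it the second time, so only one connection ever exists for that name.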