Mine ClickHouse
Use the ClickHouse Miner workflow to extract query history from system.query_log and build lineage for your crawled ClickHouse assets. This page walks you through creating, configuring, and running the miner.
Prerequisites
Before you begin:
- Make sure you have crawled ClickHouse at least once so a connection exists to mine.
- Make sure query logging is enabled on your ClickHouse instance. The miner extracts query history from the system.query_log table and can't run without it.
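One way to confirm the second prerequisite is to count recent entries in system.query_log before setting up the workflow. The following is a minimal sketch, not part of the Atlan workflow: it assumes your instance exposes the ClickHouse HTTP interface and that the account you use can read system.query_log. The function names, the `base_url` value, and the 14-day default are illustrative assumptions.

```python
import urllib.request


def build_check_query(days: int = 14) -> str:
    """Build a query counting finished queries logged in the last `days` days."""
    return (
        "SELECT count() FROM system.query_log "
        f"WHERE event_date >= today() - {int(days)} AND type = 'QueryFinish'"
    )


def count_logged_queries(base_url: str, user: str, password: str, days: int = 14) -> int:
    """Run the check over the ClickHouse HTTP interface and return the count.

    `base_url` is a placeholder such as "http://localhost:8123/" (assumption).
    An HTTP error here usually means the table is missing or not readable.
    """
    request = urllib.request.Request(
        base_url,
        data=build_check_query(days).encode("utf-8"),
        headers={"X-ClickHouse-User": user, "X-ClickHouse-Key": password},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        # The default HTTP output format is tab-separated text, so a single
        # count comes back as one line containing the number.
        return int(response.read().decode("utf-8").strip())
```

A result of zero, or an error saying the table doesn't exist, suggests that query logging isn't enabled yet or that no queries have been logged in that period.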
Create miner workflow
To create the ClickHouse miner workflow:
- In the top right of any screen, navigate to +New and then click New workflow.
- Under Marketplace, from the filters along the top, click Miner.
- From the list of packages, select ClickHouse Miner and then click Setup Workflow.
Configure miner
The miner restricts you to the past two weeks of query history. If you need more history (for example, during an initial load), use the S3 miner first, then switch to query history extraction afterward.
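The two-week limit can be expressed as simple date arithmetic. The sketch below is illustrative only: the function name and return labels are hypothetical, and it assumes "two weeks" means 14 calendar days counted back from today, with the boundary day included.

```python
from datetime import date, timedelta

# Assumption: the "past two weeks" limit is treated as 14 calendar days.
QUERY_HISTORY_WINDOW = timedelta(weeks=2)


def choose_extraction_plan(desired_start: date, today: date) -> list[str]:
    """Return the order of extraction methods for a desired start date.

    The labels "s3" and "query_history" are placeholders for this sketch,
    not values accepted by the workflow.
    """
    earliest_reachable = today - QUERY_HISTORY_WINDOW
    if desired_start >= earliest_reachable:
        # The start date falls inside the window, so query history suffices.
        return ["query_history"]
    # Older history needs the S3 miner first, then query history afterward.
    return ["s3", "query_history"]
```

For example, with today set to 31 January, a desired start of 17 January stays within the window, while 1 January needs the S3 miner first.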
To configure the ClickHouse miner:
- For Connection, select the connection to mine. (To select a connection, the crawler must have already run.)
- For Miner Extraction Method, choose your extraction method:
  - In Query History, Atlan connects to your database and mines query history directly from system.query_log.
  - In Agent, Atlan uses a Self-Deployed Runtime deployed within your network to mine query history. For details on deploying the runtime, see:
  - In Offline, you need to first mine query history yourself and make it available in S3.
- For Start date, choose the earliest date from which to mine query history.
- For Advanced Config, keep Default for the default configuration or click Advanced to configure the miner:
  - For Cross Connection, click Yes to extract lineage across all available data source connections or click No to only extract lineage from the selected ClickHouse connection.
  - For Control Config, if Atlan support has provided you with a custom control configuration, select Custom and enter the configuration into the Custom Config box.
Run miner
To run the ClickHouse miner, after completing the configuration:
- For your first run, set the start date to three days before today, then click Schedule & Run to run the miner daily. The miner requires a 24–48 hour lag to capture all session transformations, so building up history gradually avoids delays. Learn more about miner logic.
- To run the miner once, immediately, at the bottom of the screen, click the Run button.
- To schedule the miner to run hourly, daily, weekly, or monthly, at the bottom of the screen, click the Schedule & Run button.
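The first-run guidance above comes down to two small date calculations. This is a hedged sketch with hypothetical helper names. It takes the conservative end of the 24–48 hour lag (48 hours) as an assumption and works at whole-day granularity, so it only approximates how the miner behaves.

```python
from datetime import date, timedelta

FIRST_RUN_LOOKBACK_DAYS = 3         # "three days before today" for the first run
CAPTURE_LAG = timedelta(hours=48)   # assumption: upper end of the 24-48 hour lag


def first_run_start_date(today: date) -> date:
    """Return the suggested start date for the first miner run."""
    return today - timedelta(days=FIRST_RUN_LOOKBACK_DAYS)


def is_past_capture_lag(query_day: date, today: date) -> bool:
    """Return True if queries from `query_day` are old enough to be fully captured."""
    return (today - query_day) >= CAPTURE_LAG
```

With today as 10 May, the suggested start date is 7 May. Queries from 8 May are past the assumed lag, and queries from 9 May may not yet be fully captured.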
Once the miner has finished running, you can see lineage for your ClickHouse assets, built from the query history in system.query_log.
Need help
If you need help configuring the miner or enabling query logging, contact Atlan Support by submitting a request.
See also
- Generate lineage for ClickHouse assets: Enable query logging and configure your ClickHouse instance so the miner can extract query history