Skip to main content

Mine PostgreSQL

Use the PostgreSQL Miner workflow to extract query history from pg_stat_statements and build lineage for your crawled PostgreSQL assets. This page walks you through creating, configuring, and running the miner.

Prerequisites

Before you begin:

  • Make sure you have crawled PostgreSQL at least once so a connection exists to mine.
  • Make sure pg_stat_statements is enabled on your PostgreSQL instance. The miner extracts query history from pg_stat_statements and can't run without it.

Create miner workflow

To create the PostgreSQL miner workflow:

  1. In the top right of any screen, navigate to +New and then click New workflow.
  2. Under Marketplace, from the filters along the top, click Miner.
  3. From the list of packages, select PostgreSQL Miner and then click Setup Workflow.

Configure miner

The PostgreSQL miner uses pg_stat_statements, which doesn't support time-based filtering like some other sources. The miner extracts the current query statistics; for incremental updates, run the miner on a schedule.

To configure the PostgreSQL miner:

  1. For Connection, select the connection to mine. (To select a connection, the crawler must have already run.)
  2. For Miner Extraction Method, choose your extraction method:
    • In Query History, Atlan connects to your database and mines query history directly from pg_stat_statements.
    • In Agent, Atlan uses a Self-Deployed Runtime deployed within your network to mine query history. For details on deploying the runtime, see:
    • In Offline, you need to first mine query history yourself and make it available in S3.
  3. For Advanced Config, keep Default for the default configuration or click Advanced to configure the miner:
    • For Cross Connection, click Yes to extract lineage across all available data source connections or click No to only extract lineage from the selected PostgreSQL connection.
    • For Control Config, if Atlan support has provided you with a custom control configuration, select Custom and enter the configuration into the Custom Config box.

Run miner

To run the PostgreSQL miner, after completing the configuration:

  • To run the miner once, immediately, at the bottom of the screen, click the Run button.
  • To schedule the miner to run hourly, daily, weekly, or monthly, at the bottom of the screen, click the Schedule & Run button.

Once the miner completes running, you see lineage for PostgreSQL assets based on query history from pg_stat_statements.

Need help

If you need help configuring the miner or enabling pg_stat_statements, contact Atlan Support by submitting a request.

See also