Set up on-premises database access

Sunsetting notice

This job execution mode is scheduled for deprecation on June 30, 2026. For new implementations, use Self-deployed runtime. If you have existing setups, contact your Account team to plan your migration.

In some cases you won't be able to expose your databases for Atlan to crawl and ingest metadata. This can happen for various reasons:

  1. Transactional databases may run under heavy load, which can make a direct connection impractical.
  2. Security requirements may prohibit outside access to sensitive, mission-critical transactional databases.

In such cases you may want to decouple the extraction of metadata from its ingestion into Atlan. This approach gives you full control over your resources and the transfer of metadata to Atlan.

Prerequisites

To extract metadata from your on-premises databases, you need Atlan's metadata-extractor tool.

Did you know?

Atlan uses exactly the same metadata-extractor behind the scenes when it connects to cloud databases.

warning

If you have already installed Docker Compose, make sure it's version 1.17.0 or higher. It's good practice to upgrade to the latest available version.
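
You can verify the installed version with either of the following standard commands, depending on whether you use the Compose v2 plugin or the standalone v1 binary:

    docker compose version    # Compose v2 plugin
    docker-compose --version  # standalone Compose v1; must report 1.17.0 or higher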

Install Docker Compose

Docker Compose is a tool for defining and running applications composed of many Docker containers. (Any guesses where the name came from? 😉)

To install Docker Compose:

  1. Install Docker
  2. Install Docker Compose
Did you know?

The instructions in this documentation are typically enough, even if you're completely new to Docker and Docker Compose. You can also walk through the Get started with Docker Compose tutorial if you want to learn the basics first.

Get metadata-extractor tool

To get the metadata-extractor tool:

  1. Raise a support ticket to get a link to the latest version.

  2. Download the image using the link provided by support.

  3. To load the image:

    • For Docker Image, load the image to the server you'll use to crawl databases:

      sudo docker load -i /path/to/jdbc-metadata-extractor-master.tar
    • For OCI Image:

      • Docker:

        • Install Skopeo.

        • Load the image to the server you'll use to crawl databases:

          skopeo copy oci-archive:/path/to/jdbc-metadata-extractor-master-oci.tar docker-daemon:jdbc-metadata-extractor-master:latest
      • Podman:

        • Load the image to the server you'll use to crawl databases:

          podman load -i /path/to/jdbc-metadata-extractor-master-oci.tar
          podman tag <loaded image hash> jdbc-metadata-extractor-master:latest
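
Whichever method you used, you can confirm the image is available before continuing. These are standard Docker and Podman commands; the image name assumes the archive names above:

    docker images jdbc-metadata-extractor-master   # if loaded with Docker
    podman images jdbc-metadata-extractor-master   # if loaded with Podman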

Get compose file

Atlan provides a configuration file for the metadata-extractor tool. This is a Docker Compose file.

To get the compose file:

  1. Download the latest compose file.
  2. Save the file to an empty directory on the server you'll use to access your on-premises databases.
  3. Keep the file name docker-compose.yml.
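
For example, a minimal setup on the server might look like this (the directory name atlan-extraction is arbitrary):

    mkdir atlan-extraction
    cd atlan-extraction
    # save docker-compose.yml here; output folders will be created alongside it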

Define database connections

The compose file includes three main sections:

  • x-templates contains configuration fragments. Ignore this section - don't make any changes to it.
  • services is where you define your database connections.
  • volumes contains mount information. Ignore this section as well - don't make any changes to it.

Define services

For each on-premises database, define an entry under services in the compose file.

Each entry has the following structure:

services:
  CONNECTION-NAME:
    <<: *extract
    environment:
      <<: *CONNECTION-TYPE
      # Credentials
      # Database address
    volumes:
      # Output folder
  • Replace CONNECTION-NAME with the name of your connection.
  • <<: *extract tells the metadata-extractor tool to run.
  • environment contains all parameters for the tool.
  • <<: *CONNECTION-TYPE applies default arguments for the corresponding connection type.

Refer to Supported connections for on-premises databases for full details of each connection type.

Example

The following example explains the structure in detail:

services:
  inventory: # 1. Call this connection "inventory"
    <<: *extract
    environment:
      <<: *psql # 2. Connect to PostgreSQL using basic authentication
      USERNAME: some-username # 3. Credentials
      PASSWORD: some-password
      HOST: inventory.local # 4. Database address
      PORT: 5432
      DATABASE: inventory
    volumes:
      - *shared-jdbc-drivers
      - ./output/inventory:/output # 5. Store results in ./output/inventory
  1. The name of this service is inventory. You can use any meaningful name you want. In this example, the service uses the same name as the database it crawls.
  2. <<: *psql sets the connection type to PostgreSQL using basic authentication.
  3. USERNAME and PASSWORD specify the credentials required for the psql connection.
  4. HOST, PORT, and DATABASE specify the database address. PORT defaults to 5432, so you can usually omit it.
  5. The ./output/inventory:/output line specifies where to store results. Replace inventory with the name of your connection, and store the output for each database in a separate folder.

You can add as many database connections as you want.
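
For instance, a compose file that crawls two PostgreSQL databases could define two services side by side. This is a minimal sketch; the orders service, its credentials, and its hostname are hypothetical:

    services:
      inventory:
        <<: *extract
        environment:
          <<: *psql
          USERNAME: some-username
          PASSWORD: some-password
          HOST: inventory.local
          DATABASE: inventory
        volumes:
          - *shared-jdbc-drivers
          - ./output/inventory:/output
      orders: # hypothetical second connection
        <<: *extract
        environment:
          <<: *psql
          USERNAME: another-username
          PASSWORD: another-password
          HOST: orders.local
          DATABASE: orders
        volumes:
          - *shared-jdbc-drivers
          - ./output/orders:/output # separate output folder per connection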

Did you know?

Docker's documentation describes the services format in more detail.

Secure credentials

warning

If you decide to keep database credentials in the compose file, restrict access to the directory and the compose file. For extra security, use Docker secrets to store sensitive credentials.

To create and use Docker secrets:

  1. Create a JSON file and add the credentials that you want to use in Docker secrets. For example:

    {
      "USERNAME": "my-secret-user",
      "PASSWORD": "my-secret-password"
    }
    info

    💪 Did you know? The keys in this file are environment variable names, so consider migrating sensitive variables from the compose file to secrets. Variables set in secrets take precedence over those in the compose file; any variable not provided in secrets falls back to the value in the compose file.

  2. Create a new Docker secret:

    docker secret create my_database_credentials credentials.json
  3. At the top of your compose file, add a secrets element to access your secret:

    secrets:
      my_database_credentials:
        external: true
  4. Within the service section of the compose file, add a secrets element referencing your secret, and set CREDENTIAL_SECRET_PATH to the path where the secret is mounted so the tool uses it for credentials.

    warning

    If you have added database credentials directly to the compose file, Atlan recommends leaving CREDENTIAL_SECRET_PATH blank.

For example, your compose file may look something like this:

secrets:
  my_database_credentials:
    external: true

x-templates:
  # ...

services:
  my-database:
    <<: *extract
    environment:
      <<: *psql
      CREDENTIAL_SECRET_PATH: "/run/secrets/my_database_credentials"
      # ...
    volumes:
      # ...
    secrets:
      - my_database_credentials

volumes:
  jars:
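
Before running the stack, you can confirm the secret is registered with the standard Docker command for listing secrets:

    docker secret ls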

Troubleshooting secure credentials

Atlan recommends the following troubleshooting measures:

  • If you're unable to create Docker secrets, make sure Swarm mode is enabled. Secrets are encrypted during transit and at rest in a Docker swarm. Run the following command to enable Swarm mode:

    docker swarm init
  • If running the compose file after providing the credentials secret results in an Unsupported external secret <secret_name> error, complete the following steps:

    1. Modify the compose file as follows:

      secrets:
        my_database_credentials:
          external: true

      x-templates:
        # ...

      services:
        my-database:
          <<: *extract
          environment:
            <<: *psql
            CREDENTIAL_SECRET_PATH: "/run/secrets/my_database_credentials"
            # ...
          volumes:
            # ...
          secrets:
            - my_database_credentials
          deploy:
            replicas: 1
            restart_policy:
              condition: none

      volumes:
        jars:
    2. Run the compose file using the following command:

      docker stack deploy -c docker-compose.yml <stack_name>
      • Replace <stack_name> with a name for your deployment stack.
    3. Verify deployment status using the following command:

      docker stack ps --no-trunc <stack_name>
      • Replace <stack_name> with the name you provided in the previous step.
    4. If the stack deployed successfully, monitor the Docker service logs using the following command:

      docker service logs <stack_name>_<service_name> --follow
      • Replace <stack_name> with the name you provided while deploying the stack.
      • Replace <service_name> with the name of the service in Docker.
      warning

      The docker stack deploy command runs all the services in the docker-compose.yml file, so make sure the docker-compose.yml only contains the service you intend to run.
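
Once extraction completes, you can check that Swarm mode is active and remove the stack so its services don't linger. Both are standard Docker commands; <stack_name> is the name you chose above:

    docker info --format '{{.Swarm.LocalNodeState}}'   # prints "active" when Swarm mode is on
    docker stack rm <stack_name>                       # removes the deployed stack and its services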