
Crawl CrateDB

Extract metadata from your CrateDB database and make it available in Atlan for data discovery, governance, and lineage tracking. This guide walks you through setting up authentication and running your first crawl.

Prerequisites

Before you begin, make sure you have:

  • A CrateDB user set up with the permissions required for metadata extraction
  • Network connectivity between Atlan and your CrateDB instance
  • Your CrateDB cluster HTTP endpoint and port information

Set up workflow

Create a new CrateDB Assets workflow to extract metadata from your database.

  1. Select New > New Workflow.
  2. From the list of packages, select CrateDB Assets.
  3. Click Setup Workflow.

Configure extraction method

Choose how to connect to your CrateDB environment:

  1. Select Direct for the extraction method.
  2. Enter your CrateDB connection details:
    • Host: Your CrateDB cluster HTTP endpoint (for example, https://your-cluster.crate.io)
    • Port: The port number of your CrateDB instance
    • Authentication: Choose Basic authentication
    • Username: Enter the username you configured in CrateDB
    • Password: Enter the password you configured in CrateDB
    • Database: Enter the name of the database to crawl
  3. Click Test Authentication to confirm connectivity to CrateDB using these details.
  4. When successful, click Next.
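
If Test Authentication fails, it can help to verify the same host, port, and credentials outside Atlan. The sketch below is one way to do that, not how Atlan itself tests the connection: it builds a request against CrateDB's HTTP SQL endpoint (/_sql) with a Basic authentication header. The function names, the atlan_user account, and the port 4200 (CrateDB's default HTTP port) are assumptions for illustration; substitute your own values.

```python
import base64
import json
import urllib.request


def build_sql_request(host, port, username, password, stmt="SELECT 1"):
    """Build a POST request for CrateDB's HTTP /_sql endpoint using Basic auth.

    All argument values are supplied by the caller; nothing here is read
    from Atlan.
    """
    url = f"{host.rstrip('/')}:{port}/_sql"
    # Basic auth is the base64 encoding of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def check_connection(request, timeout=10):
    """Send the request and return True if CrateDB answers with HTTP 200.

    Raises urllib.error.URLError (or HTTPError) on network or auth failures.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200


if __name__ == "__main__":
    # Hypothetical values: replace with your own endpoint and credentials.
    req = build_sql_request("https://your-cluster.crate.io", 4200, "atlan_user", "secret")
    print(req.full_url)
    # Uncomment to actually contact the cluster:
    # print(check_connection(req))
```

An HTTP 401 response from this check points to a credential problem, while a timeout or connection error points to networking between the client and the cluster.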

Configure connection details

  1. Enter a Connection Name to identify your CrateDB environment, for example, production-cratedb, analytics-db, or data-warehouse.
  2. Assign Connection Admins to manage access. At least one admin is required.

Configure crawler settings

Before running the CrateDB crawler, you can configure additional settings:

  • Exclude Metadata: Select assets you want to exclude from crawling
  • Include Metadata: Select assets you want to include in crawling
  • Exclude regex for tables & views: Specify a regular expression to ignore tables and views based on naming conventions
  • Advanced Config:
    • Enable Source Level Filtering: Enable schema-level filtering at source
    • Use JDBC Internal Methods: Enable JDBC internal methods for data extraction

Did you know?

If an asset appears in both the include and exclude filters, the exclude filter takes precedence.
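
To make the interaction between these filters concrete, here is a minimal sketch of the precedence rule in Python. It is an illustration, not Atlan's implementation. The function is_crawled is hypothetical, and two behaviors are assumptions: an empty include filter is treated as "include everything", and the exclude regex must match the whole table or view name.

```python
import re


def is_crawled(schema, table, include_schemas, exclude_schemas, exclude_regex=None):
    """Decide whether a table or view would be crawled under the given filters.

    Assumptions (not confirmed by the product documentation):
      - an empty include filter means every schema is included
      - the exclude regex must match the full table or view name
    """
    # The exclude filter takes precedence over the include filter.
    if schema in exclude_schemas:
        return False
    if include_schemas and schema not in include_schemas:
        return False
    # The regex filter drops tables and views by naming convention.
    if exclude_regex and re.fullmatch(exclude_regex, table):
        return False
    return True


# "doc" appears in both filters, so the exclude filter wins.
print(is_crawled("doc", "orders", {"doc"}, {"doc"}))

# Tables with a tmp_ prefix are skipped by the regex.
print(is_crawled("doc", "tmp_orders", {"doc"}, set(), r"tmp_.*"))
```

Testing a regular expression against a handful of real table names in this way before saving the workflow can catch patterns that exclude more than intended.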

Run crawler

You can now start extracting metadata from your CrateDB database:

  • Run now: Click Run to start a one-time crawl.
  • Schedule runs: Click Schedule Run to automate recurring crawls (hourly, daily, weekly, or monthly).

Monitor crawl progress in the activity log. Once the crawl completes, your CrateDB assets appear in Atlan.

Troubleshooting

If you encounter connection or authentication issues, see Connection issues.
