
Crawl AlloyDB for PostgreSQL

Extract metadata assets from your AlloyDB for PostgreSQL database into Atlan.

Prerequisites

Before you begin, verify you have:

Create crawler workflow

Create a new workflow and select AlloyDB (PostgreSQL) as your connector source.

  1. In the top-right corner of any screen, select New > New Workflow.
  2. From the list of packages, select AlloyDB (PostgreSQL) > Setup Workflow.

Configure extraction

When setting up metadata extraction from your AlloyDB for PostgreSQL instance, you need to choose how Atlan connects and extracts metadata. Select the extraction method that best fits your organization's security and network requirements:

Direct: Atlan SaaS connects directly to your AlloyDB instance (typically via the public endpoint and AlloyDB connectors). This method supports multiple authentication options and lets you test the connection before proceeding.

  1. Choose whether to use the default connection settings or provide a custom Postgres Driver URL:

    • Host: Use the default Postgres Driver URL based on standard connection parameters (host, port, database name).
    • URL: Provide a custom Postgres Driver URL with specific driver options. Make sure your connection string conforms to the PostgreSQL Driver documentation and is applicable to AlloyDB.
  2. Choose an authentication method for your direct connection. For IAM-based authentication, use the AlloyDB connectors/Auth Proxy to generate database auth tokens.
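
As a rough illustration of the URL option, the sketch below assembles a connection string in the JDBC style used by the PostgreSQL driver (jdbc:postgresql://host:port/database?option=value). It assumes this is the format the URL field expects, which you should confirm against the PostgreSQL Driver documentation. The function name, host, and database are hypothetical; sslmode and connectTimeout are shown only as examples of driver options.

```python
from urllib.parse import urlencode


def build_postgres_driver_url(host, database, port=5432, **options):
    """Return a JDBC-style PostgreSQL URL (assumed format, see lead-in)."""
    base = f"jdbc:postgresql://{host}:{port}/{database}"
    if not options:
        return base
    # Sort the options so the resulting query string is reproducible.
    query = urlencode(sorted(options.items()))
    return f"{base}?{query}"


# Hypothetical host and database, for illustration only.
print(build_postgres_driver_url("10.0.0.5", "analytics"))
print(build_postgres_driver_url("10.0.0.5", "analytics",
                                sslmode="require", connectTimeout=10))
```

With no options, the result contains only the standard parameters (host, port, database name), which corresponds to what the Host option describes.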

  1. Use standard database credentials created in your AlloyDB for PostgreSQL instance.

    • Username: Enter the database username you created.
    • Password: Enter the password for the specified user.
    • Host: Enter the IP address or hostname exposed for your AlloyDB instance.
    • Port: Specify the database port number (default is 5432).
    • Database: Enter the name of the database you want to crawl.
  2. After entering the authentication details, click Test Authentication to verify your configuration. If the test is successful, click Next to proceed with the connection configuration.
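
If Test Authentication fails, it can help to first rule out basic network problems from a machine that is allowed to reach the instance. The sketch below is a minimal, standard-library check that a TCP connection to the host and port can be opened. It does not validate the username, password, or database, and the function name is hypothetical.

```python
import socket


def can_reach(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, can_reach("10.0.0.5", 5432) returning False would suggest a firewall, allowlist, or endpoint issue rather than a credentials problem. Note that this runs from your machine, not from Atlan, so a successful result does not guarantee that Atlan can reach the same endpoint.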

Advanced options

  • SQLAlchemy Args: Comma-separated list of arguments that are passed to the SQLAlchemy engine as connect_args.
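
To show how such a list might end up as connect_args, here is a small sketch that parses a comma-separated string into a dictionary. The key=value syntax is an assumption, since the exact format the field accepts is not specified here, and parse_connect_args is a hypothetical helper. The values sslmode and connect_timeout are only examples.

```python
def parse_connect_args(raw):
    """Parse 'key=value,key=value' into a dict (assumed input format)."""
    connect_args = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and blank entries
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        connect_args[key.strip()] = value.strip()
    return connect_args


args = parse_connect_args("sslmode=require, connect_timeout=10")
print(args)

# With SQLAlchemy installed, a dict like this is what create_engine accepts:
# engine = sqlalchemy.create_engine(url, connect_args=args)
```

The parser keeps every value as a string and leaves any type conversion to the driver.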

Configure connection

Set up the connection name and access controls for your AlloyDB for PostgreSQL data source in Atlan.

  1. Provide a Connection Name that represents your source environment. For example, you might use values like production, development, gold, or analytics.
  2. To change the users able to manage this connection, update the users or groups listed under Connection Admins. If you don't specify any user or group, nobody can manage the connection (not even admins).
  3. At the bottom of the screen, click Next to proceed.

Configure crawler

Before running the crawler, you can configure which assets to include or exclude. These options are only available when using the direct extraction method. If an asset appears in both the include and exclude filters, the exclude filter takes precedence.

  • To exclude specific assets from crawling, select Exclude Metadata. This defaults to no assets if none are specified.
  • To include specific assets in crawling, select Include Metadata. This defaults to all assets if none are specified.
  • To ignore tables and views based on a naming convention, specify a regular expression in the Exclude regex for tables & views field.
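
The way these three filters combine can be sketched as a single decision function. This is an illustration of the rules described above, not Atlan's implementation. It assumes include and exclude filters are sets of schema names and that the regular expression must match the whole table or view name. Both are assumptions, and should_crawl is a hypothetical name.

```python
import re


def should_crawl(schema, table, include=None, exclude=None, exclude_regex=None):
    """Decide whether a table or view would be crawled under the given filters."""
    include = include or set()   # empty include means "all assets"
    exclude = exclude or set()   # empty exclude means "no assets"
    if schema in exclude:
        return False             # exclude takes precedence over include
    if include and schema not in include:
        return False
    # Assumption: the regex is matched against the full table or view name.
    if exclude_regex and re.fullmatch(exclude_regex, table):
        return False
    return True


print(should_crawl("sales", "orders"))
print(should_crawl("sales", "orders", include={"sales"}, exclude={"sales"}))
print(should_crawl("sales", "tmp_orders", exclude_regex=r"tmp_.*"))
```

The first call returns True because no filters are set. The second returns False because the schema appears in both filters and exclude wins. The third returns False because the name matches the exclusion pattern.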

Run crawler

  1. Click Preflight checks to validate permissions and configuration before running the crawler. This helps identify any potential issues early.
  2. After the preflight checks pass, you can either:
    • Click Run to run the crawler once immediately.
    • Click Schedule Run to schedule the crawler to run hourly, daily, weekly, or monthly.

Once the crawler has finished running, you can see the assets on Atlan's assets page! 🎉
