Crawl Cloud SQL for PostgreSQL

Extract metadata assets from your Cloud SQL for PostgreSQL database into Atlan.

Prerequisites

Before you begin, verify that you have the database credentials and the network access Atlan needs to reach your Cloud SQL for PostgreSQL instance.

Create crawler workflow

Create a new workflow and select Cloud SQL (PostgreSQL) as your connector source.

  1. In the top-right corner of any screen, select New > New Workflow.
  2. From the list of packages, select Cloud SQL (PostgreSQL) > Setup Workflow.

Choose extraction method

When setting up metadata extraction from your Cloud SQL for PostgreSQL instance, you need to choose how Atlan connects and extracts metadata. Select the extraction method that best fits your organization's security and network requirements:

Direct extraction: Atlan SaaS connects directly to your Cloud SQL instance. This method supports multiple authentication options and lets you test the connection before proceeding.

Connection type

Choose whether to use the default connection settings or provide a custom JDBC URL:

  • Host: Use the default JDBC URL based on standard connection parameters (host, port, database name).

  • URL: Provide a custom JDBC URL with specific driver options.

    Custom JDBC URL

    When using the URL option, make sure your connection string conforms to the Cloud SQL JDBC documentation and the PostgreSQL JDBC Driver documentation.
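
    For example, a custom JDBC URL generally follows the standard PostgreSQL driver format. The host, database, and sslmode values below are placeholders to replace with your own settings:

      jdbc:postgresql://<public-ip-or-hostname>:5432/<database>?sslmode=require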

Configure authentication

Choose an authentication method for your direct connection. When using IAM-based authentication, Atlan uses the Cloud SQL Language Connector for added security.

Use standard database credentials created in your Cloud SQL instance, for example a dedicated read-only user (see the sketch after the field list below).

  • Username: Enter the database username you created in Cloud SQL for PostgreSQL.
  • Password: Enter the password for the specified user.
  • Host: Enter the public IP address of your Cloud SQL instance.
  • Port: Specify the database port number (default is 5432).
  • Database: Enter the name of the database you want to crawl.
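
A common pattern is to create a dedicated, read-only user for the crawler rather than reusing an existing account. The following is a minimal sketch; atlan_crawler, my_database, and my_schema are placeholder names, and the exact privileges required may differ depending on what you want Atlan to crawl:

  -- Placeholder names: replace atlan_crawler, the password, my_database, and my_schema with your own.
  CREATE USER atlan_crawler WITH PASSWORD 'choose-a-strong-password';
  GRANT CONNECT ON DATABASE my_database TO atlan_crawler;
  GRANT USAGE ON SCHEMA my_schema TO atlan_crawler;
  GRANT SELECT ON ALL TABLES IN SCHEMA my_schema TO atlan_crawler;
  -- Ensure tables created later are also readable by the crawl user.
  ALTER DEFAULT PRIVILEGES IN SCHEMA my_schema GRANT SELECT ON TABLES TO atlan_crawler;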

After entering the authentication details, click Test Authentication to verify your configuration. If the test is successful, click Next to proceed with the connection configuration.
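
If Test Authentication fails, one way to separate credential issues from network issues is to try the same details with the psql client from a machine that can reach the instance. The values below are placeholders:

  psql "host=<public-ip> port=5432 dbname=<database> user=<username> sslmode=require"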

Configure connection

To complete the connection configuration:

  1. Provide a Connection Name that represents your source environment. For example, you might use values like production, development, gold, or analytics.

  2. (Optional) To control who can manage this connection, update the users or groups listed under Connection Admins. If you don't specify any user or group, nobody can manage the connection, not even admins.

  3. At the bottom of the screen, click Next to proceed.

Configure crawler

Before running the crawler, you can further configure it. (These options are only available when using the direct extraction method.)

You can override the defaults for any of these options:

  • To select the assets you want to exclude from crawling, click Exclude Metadata. (If none are specified, no assets are excluded.)
  • To select the assets you want to include in crawling, click Include Metadata. (If none are specified, all assets are included.)
  • To have the crawler ignore tables and views based on a naming convention, specify a regular expression in the Exclude regex for tables & views field (see the example below).
  • For Advanced Config, keep Default for the default configuration or click Custom to configure the crawler:
    • For Enable Source Level Filtering, click True to enable schema-level filtering at source or click False to disable it.
    • For Use JDBC Internal Methods, click True to enable JDBC internal methods for data extraction or click False to disable it.

If an asset appears in both the include and exclude filters, the exclude filter takes precedence.
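
As an illustration, if your source follows a naming convention such as suffixing temporary or staging tables, a pattern like the following in the Exclude regex for tables & views field would keep them out of the crawl. The suffixes are hypothetical; adjust the expression to your own convention:

  .*_(tmp|staging|backup)$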

Run crawler

To run the Cloud SQL for PostgreSQL crawler:

  1. Run preflight checks (Direct extraction only): Click Preflight checks to validate permissions and configuration before running the crawler. This helps identify any potential issues early. If you're using Agent extraction, skip to step 2.
  2. Execute the crawler: You can either:
    • To run the crawler once immediately, at the bottom of the screen, click the Run button.
    • To schedule the crawler to run hourly, daily, weekly, or monthly, at the bottom of the screen, click the Schedule Run button.

Once the crawler has finished running, you can see the assets on Atlan's assets page! 🎉

See also