Crawl Confluent Schema Registry
Configure and run the standalone Schema Registry crawler to extract subjects and schema versions into Atlan. To crawl Schema Registry alongside Confluent Kafka in a single workflow, see Crawl Confluent Kafka instead.
Prerequisites
Before you begin, make sure you have:
- Completed the Confluent Schema Registry setup and have your Schema Registry endpoint, API key, and API secret ready
- Reviewed the order of operations for crawling metadata
- The required permissions in Atlan to create and manage a connection
Create crawler workflow
- In Atlan, select New > New Workflow.
- Select Confluent Schema Registry Assets and click Setup Workflow.
- Choose your extraction method:
  - Direct -- Atlan connects to your Confluent Schema Registry and crawls metadata directly over the network.
  - Agent -- Self-Deployed Runtime executes metadata extraction within your organization's environment, keeping all connections inside your network perimeter.
Configure extraction
For the Direct extraction method:
- For Host, enter your Schema Registry endpoint.
- For API Key, enter the API key you copied.
- For API Secret, enter the API secret you copied.
- Click Test Authentication to confirm connectivity, then click Next.
For the Agent extraction method, before configuring the crawler:
- Install Self-Deployed Runtime if you haven't already.
- Confirm the runtime can reach your Confluent Schema Registry over your local network and that network security is configured.
To configure the crawler:
- Select the Agent tab in the workflow setup.
- Under Secure Agent Configuration, select your deployed agent from the Agent dropdown and the secret store from the Secret Store dropdown.
- For Host, enter your schema registry endpoint as reachable from within your network.
- For API Key, reference the secret store path where the API key is stored.
- For API Secret, reference the secret store path where the API secret is stored.
- Store sensitive credential values in your secret store and reference them in the corresponding fields. For more information, see Configure secrets for workflow execution.
- Click Next after completing the configuration.
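If you want to sanity-check the endpoint and credentials outside Atlan before entering them, the following is a minimal Python sketch. It assumes your Schema Registry accepts HTTP basic authentication with the API key as the username and the API secret as the password, and it calls the registry's `GET /subjects` REST endpoint. The function names are illustrative only, and this isn't necessarily what Test Authentication does internally.

```python
import base64
import json
import urllib.request


def build_subjects_request(host: str, api_key: str, api_secret: str) -> urllib.request.Request:
    """Build a GET /subjects request that authenticates with HTTP basic auth."""
    # Basic auth is "key:secret", base64-encoded (assumed credential scheme).
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8")).decode("ascii")
    url = host.rstrip("/") + "/subjects"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/vnd.schemaregistry.v1+json",
        },
        method="GET",
    )


def list_subjects(host: str, api_key: str, api_secret: str, timeout: float = 10.0) -> list:
    """Return the subject names visible to these credentials.

    urllib raises HTTPError on a 401/403 response, which points to bad
    credentials or missing permissions, and URLError when the host is
    unreachable.
    """
    request = build_subjects_request(host, api_key, api_secret)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (replace the placeholders with your own values):
# subjects = list_subjects("https://<your-schema-registry-endpoint>", "<api-key>", "<api-secret>")
# print(len(subjects), "subjects visible")
```

For the Agent method, run a check like this from a machine inside the same network as the runtime, since that is where the endpoint has to be reachable.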
Configure connection
- Provide a Connection Name that represents your source environment -- for example, production, development, gold, or analytics.
- Under Connection Admins, add the users or groups that can manage this connection.
  Warning: If you don't specify any user or group, no one can manage the connection -- not even admins.
- At the bottom of the screen, click Next.
Configure crawling options
On the Metadata page, you can override the defaults for any of these options:
- Click Exclude subjects to exclude specific subjects from crawling. Defaults to no exclusions if none are specified.
- Click Include subjects to limit crawling to specific subjects. Defaults to all subjects if none are specified.
If a subject appears in both the include and exclude filters, the exclude filter takes precedence.
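The filter behavior described above can be sketched in Python as follows. This is only an illustration of the stated rules (an empty include list means all subjects, an empty exclude list means no exclusions, and exclude wins on conflict). It assumes exact subject-name matching, which may differ from how Atlan actually matches subjects, and `select_subjects` is a hypothetical helper, not part of any Atlan API.

```python
from typing import Iterable, List, Optional


def select_subjects(
    all_subjects: Iterable[str],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the subjects that would be crawled under the include/exclude rules."""
    include_set = set(include or [])
    exclude_set = set(exclude or [])

    selected = []
    for subject in all_subjects:
        # An empty include filter means "crawl everything".
        if include_set and subject not in include_set:
            continue
        # Exclude is checked last, so it overrides include on conflict.
        if subject in exclude_set:
            continue
        selected.append(subject)
    return selected


subjects = ["orders-value", "orders-key", "payments-value"]

# "orders-key" is in both filters, so the exclude filter removes it.
print(select_subjects(subjects, include=["orders-value", "orders-key"], exclude=["orders-key"]))
```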
Run crawler
For the Direct extraction method:
- Click Preflight checks to validate permissions and configuration before running.
- After preflight checks pass, either:
  - Click Run to run the crawler once immediately.
  - Click Schedule & Run to schedule the crawler to run hourly, daily, weekly, or monthly.
For the Agent extraction method, either:
- Click Run to run the crawler once immediately.
- Click Schedule & Run to schedule the crawler to run hourly, daily, weekly, or monthly.
Once the crawler completes, the assets appear on Atlan's asset page.
See also
- What does Atlan crawl from Confluent Schema Registry: Assets and metadata discovered during crawling
- Preflight checks for Confluent Schema Registry: Validation checks for permissions and configuration