Crawl Coalesce (Private Preview)
Configure the Coalesce crawler in Atlan, pair it with your Snowflake connection, and run your first crawl.
Prerequisites
Before you begin, make sure you have:
- A Coalesce API token. See Set up Coalesce.
- An Atlan Snowflake connection that has already crawled successfully.
- The list of Coalesce projects and environments you want to include.
- Confirmation that your Coalesce jobs run at least once every 7 days for every node you want cataloged. Nodes that run less often than this 7-day lookback window appear blank in Atlan between runs. Contact your Atlan team to extend the window if needed.
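To check the last prerequisite before you connect, you can compare each node's most recent run time against the 7-day window. The sketch below assumes you have already gathered last-run timestamps yourself; the function name, the node names, and the dates are hypothetical and only illustrate the comparison.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(days=7)  # the 7-day window described in the prerequisites


def nodes_outside_lookback(last_runs, now=None, lookback=LOOKBACK):
    """Return the names of nodes whose most recent run is older than the window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - lookback
    return sorted(name for name, last_run in last_runs.items() if last_run < cutoff)


# Hypothetical sample data: node name -> time of its most recent Coalesce run.
now = datetime(2024, 6, 15, tzinfo=timezone.utc)
last_runs = {
    "STG_ORDERS": datetime(2024, 6, 14, tzinfo=timezone.utc),
    "DIM_CUSTOMER": datetime(2024, 6, 1, tzinfo=timezone.utc),
}

print(nodes_outside_lookback(last_runs, now=now))  # ['DIM_CUSTOMER']
```

Any node the function returns would likely appear blank in Atlan until its next run, so either schedule it more often or ask your Atlan team about a longer window.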
Create crawler workflow
Configure connection
- In the top right of any screen in Atlan, click New and then New workflow.
- In the Workflow center Marketplace, search for and select the Coalesce package.
- In the Credential step, provide your Coalesce credentials:
  - For Host, enter your Coalesce instance URL, for example https://app.coalescesoftware.io.
  - For Bearer Token, enter your Coalesce API token.
  - Click Test Authentication to confirm Atlan can reach Coalesce, then continue.
- In the Connection step, provide a connection name that identifies this Coalesce instance.
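If Test Authentication fails, it can help to confirm the host and token outside Atlan. The following is a minimal sketch, not Atlan's own check: it only builds a bearer-token request with Python's standard library. The `/api/v1/environments` path is an assumption about the Coalesce REST API, `build_auth_check` is a hypothetical helper, and `YOUR_API_TOKEN` is a placeholder; confirm the endpoint in the Coalesce API reference before relying on it.

```python
import urllib.request


def build_auth_check(host, token):
    """Build (but do not send) a GET request that carries the bearer token."""
    # Assumption: /api/v1/environments is a lightweight authenticated endpoint.
    url = host.rstrip("/") + "/api/v1/environments"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


req = build_auth_check("https://app.coalescesoftware.io/", "YOUR_API_TOKEN")
print(req.full_url)

# To actually send it:
#   urllib.request.urlopen(req, timeout=10)
# A rejected token surfaces as urllib.error.HTTPError with status 401.
```

Building the request separately from sending it keeps the token handling easy to inspect and avoids an accidental network call while you are still checking the host value.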
Configure crawling options
- In the Configuration step, set up what Atlan crawls:
  - Select the Snowflake connection that Coalesce materializes into. This pairing tells Atlan where to write the `sqlCoalesce*` attributes and announcements.
  - Use the include/exclude filters to restrict the crawl to specific Coalesce projects or environments. Leaving these filters blank crawls every project and environment the token can access, which significantly increases Coalesce API calls per crawl.
  - (Optional) Set the crawl schedule. Pick a time that runs after your Coalesce job windows complete. See Rate limits and scheduling for timing guidance.
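The include/exclude behavior can be pictured as a simple scope check. This sketch illustrates the idea only; it is not the connector's implementation. That a blank include list means "everything" comes from the text above, while the rule that an exclude match overrides an include match is an assumption. The function and project names are hypothetical.

```python
def in_crawl_scope(name, include=(), exclude=()):
    """Decide whether a project or environment name falls inside the crawl."""
    # A blank include list means everything the token can access is in scope.
    included = not include or name in include
    # Assumption: an exclude match wins over an include match.
    return included and name not in exclude


projects = ["Finance", "Marketing", "Sandbox"]  # hypothetical project names

# Blank filters: every project is crawled.
print([p for p in projects if in_crawl_scope(p)])

# Include two projects, then exclude one of them again.
print([
    p for p in projects
    if in_crawl_scope(p, include=["Finance", "Sandbox"], exclude=["Sandbox"])
])  # ['Finance']
```

Listing the projects you expect to be in scope this way, before saving the workflow, makes it easier to spot a filter that is broader than intended.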
Run crawler
- Save the workflow and run it to validate authentication and reachability.
Crawl cadence
| Workspace size | Recommended Atlan crawl cadence |
|---|---|
| Up to ~500 nodes | Hourly is fine. |
| 500–2,000 nodes | Hourly is fine in most cases. Monitor Coalesce rate-limit headroom. |
| More than ~2,000 nodes | The connector has been validated up to ~2,000 nodes. Contact your Atlan customer success manager before connecting larger workspaces. |
Two factors drive cadence:
- Coalesce API call volume: larger workspaces require more API calls per crawl, increasing pressure on Coalesce's rate limits.
- Coalesce run frequency: there's no benefit to running the Atlan crawl more often than Coalesce itself runs jobs.
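The table and the two factors above can be combined into a rough rule of thumb. The sketch below is one way to express it, assuming hourly is the most frequent cadence worth considering (it is the fastest the table mentions) and treating the approximate 500 and 2,000 node boundaries as hard cutoffs. Both function names are hypothetical.

```python
def suggest_crawl_interval_hours(node_count, coalesce_job_interval_hours):
    """Suggest hours between Atlan crawls, or None beyond the validated size."""
    if node_count > 2000:
        # Beyond the validated size: contact your Atlan customer success manager.
        return None
    # Never crawl more often than Coalesce runs jobs, and no more than hourly.
    return max(1.0, float(coalesce_job_interval_hours))


def should_monitor_rate_limits(node_count):
    """Mid-sized workspaces should watch Coalesce rate-limit headroom."""
    return 500 < node_count <= 2000


print(suggest_crawl_interval_hours(300, 0.25))   # 1.0
print(suggest_crawl_interval_hours(1200, 6))     # 6.0
print(suggest_crawl_interval_hours(5000, 1))     # None
print(should_monitor_rate_limits(1200))          # True
```

For example, a 1,200-node workspace whose Coalesce jobs run every six hours gains nothing from an hourly crawl, so a six-hour interval keeps API usage lower without leaving Atlan staler.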
Re-running crawls
Re-run the crawl manually at any time from the connector page. Every crawl re-fetches all data from scratch—there's no incremental state between runs and no risk of double-writes.
See also
- What does Atlan crawl from Coalesce: Full inventory of ingested metadata.
- Troubleshooting connectivity: Resolve common issues.