Crawl Oracle
Create a crawler workflow to automatically discover and catalog your Oracle assets, including databases, schemas, tables, views, and columns.
Prerequisites
Before you begin, make sure you have:
- Configured Oracle user permissions with metadata read access
- Oracle database connection details (host, port, SID, credentials)
- Reviewed the order of operations for workflow execution
Create crawler workflow
Create a new Oracle crawler workflow in Atlan by selecting the Oracle connector package, configuring your extraction method and connection details, and running the crawler to extract metadata.
- In the top right of any screen, navigate to New and then click New Workflow.
- From the list of packages, select Oracle Assets and click Setup Workflow.
Configure extraction
Select your extraction method and provide the connection details.
- Direct
- Offline
- Agent
In Direct extraction, Atlan connects to your database and crawls metadata directly.
- For Host Name, enter the host for your Oracle instance.
- For Port, enter the port number of your Oracle instance.
- For Username and Password, enter the credentials you created when configuring permissions.
- For SID, enter the Oracle system identifier for your database.
- For Default Database Name, enter the database name (usually the same as the SID).
- Click Test Authentication to confirm connectivity to Oracle using these details.
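Because these fields use a SID (not a service name), it can help to sanity-check the connect descriptor they imply before testing authentication. The sketch below builds a standard Oracle TNS connect descriptor from a host, port, and SID; the host and SID shown are placeholders, and python-oracledb is mentioned only as one driver you might use for an out-of-band connectivity check.

```python
def sid_connect_descriptor(host: str, port: int, sid: str) -> str:
    """Build a standard Oracle TNS connect descriptor for a SID.

    Easy Connect strings (host:port/name) address a service name rather
    than a SID, so SID-based connections use the full descriptor form.
    """
    return (
        "(DESCRIPTION="
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))"
        f"(CONNECT_DATA=(SID={sid})))"
    )

# Placeholder values -- substitute your own instance details.
dsn = sid_connect_descriptor("oracle.internal.example.com", 1521, "ORCL")
print(dsn)
# A driver such as python-oracledb would accept this as its dsn argument:
#   oracledb.connect(user="atlan_user", password="...", dsn=dsn)
```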
In Offline extraction, you use Atlan's metadata-extractor tool to extract metadata from Oracle and store it in S3.
- Complete the offline extraction setup to extract and upload metadata files to S3.
- For Bucket name, enter the name of your S3 bucket or Atlan's bucket.
- For Bucket prefix, enter the S3 prefix under which all the metadata files exist. These include databases.json, columns-&lt;database&gt;.json, and similar files.
- For Bucket region, enter the name of the S3 region if your bucket is in a specific region.
- Click Next at the bottom of the screen.
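Before clicking Next, you may want to verify that the extractor output actually landed under the prefix you entered. This sketch checks a list of S3 object keys for the expected file names; the bucket layout and key names are illustrative, and in practice you would obtain the keys with a call such as boto3's list_objects_v2.

```python
import fnmatch


def missing_metadata_files(keys: list[str], prefix: str) -> list[str]:
    """Return expected metadata file patterns with no match under prefix.

    The patterns reflect the file names mentioned above (databases.json,
    columns-<database>.json); adjust them for your extractor version.
    """
    expected = ["databases.json", "columns-*.json"]
    under_prefix = [k[len(prefix):].lstrip("/") for k in keys if k.startswith(prefix)]
    return [
        pattern
        for pattern in expected
        if not any(fnmatch.fnmatch(name, pattern) for name in under_prefix)
    ]


# Illustrative keys; with boto3 you might gather them like this:
#   s3 = boto3.client("s3")
#   resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
#   keys = [obj["Key"] for obj in resp["Contents"]]
keys = [
    "oracle/extract-1/databases.json",
    "oracle/extract-1/columns-SALESDB.json",
]
print(missing_metadata_files(keys, "oracle/extract-1"))  # → []
```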
In Agent extraction, Self-Deployed Runtime executes metadata extraction within your organization's environment.
- Install Self-Deployed Runtime if you haven't already.
- Select the Agent tab.
- Store sensitive information in the secret store configured with the Self-Deployed Runtime and reference the secrets in the corresponding fields. For more information, see Configure secrets for workflow execution.
- For details on individual fields, refer to the Direct extraction tab.
- Click Next after completing the configuration.
Configure connection
Set up connection details including a descriptive name, admin access, and data access permissions.
- Provide a Connection Name that represents your source environment. For example, you might use values like production, development, gold, or analytics.
- To change the users able to manage this connection, update the users or groups listed under Connection Admins. If you don't specify any user or group, nobody can manage the connection, including admins.
- To prevent users from querying Oracle data, change Allow SQL Query to No. This option applies only to Direct extraction.
- To prevent users from previewing Oracle data, change Allow Data Preview to No. This option applies only to Direct extraction.
- Click Next at the bottom of the screen.
Configure crawler
Configure crawler settings to control which assets to include or exclude. If an asset appears in both filters, the exclude filter takes precedence.
- To select specific assets for crawling, click Include Metadata. By default, all assets are included.
- To exclude specific assets from crawling, click Exclude Metadata. By default, no assets are excluded.
- To ignore tables and views based on a naming pattern, enter a regular expression in the Exclude regex for tables & views field.
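If you're unsure how a pattern will behave, you can dry-run it against a few representative table and view names first. The snippet below applies a hypothetical exclude pattern for staging and backup tables; the pattern and names are examples only, and Atlan's exact matching semantics (full vs. partial match) may differ, so treat this as a way to sanity-check the expression itself.

```python
import re

# Hypothetical pattern: exclude names starting with TMP_ or ending in _BKP.
exclude = re.compile(r"^(TMP_.*|.*_BKP)$")

names = ["CUSTOMERS", "TMP_LOAD_STAGE", "ORDERS_BKP", "ORDERS"]
excluded = [n for n in names if exclude.match(n)]
kept = [n for n in names if not exclude.match(n)]
print(excluded)  # → ['TMP_LOAD_STAGE', 'ORDERS_BKP']
print(kept)      # → ['CUSTOMERS', 'ORDERS']
```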
Run crawler
Run preflight checks to validate your configuration, then execute the crawler immediately or schedule it to run on a recurring basis.
- To verify permissions and configuration before running, click Preflight checks. This option is available for Direct extraction only.
- Choose your run option:
- To run the crawler once immediately, click Run at the bottom of the screen.
- To schedule the crawler to run hourly, daily, weekly, or monthly, click Schedule Run at the bottom of the screen.
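For reference, the recurring options map onto familiar cron-style frequencies. The table below is illustrative only — you pick the frequency in Atlan's scheduler rather than writing cron yourself, and the specific minute and hour values are placeholders.

```python
# Illustrative cron equivalents for the schedule frequencies above;
# the minute/hour values are arbitrary examples, not Atlan defaults.
CRON_EQUIVALENTS = {
    "hourly":  "0 * * * *",   # top of every hour
    "daily":   "0 2 * * *",   # 02:00 every day
    "weekly":  "0 2 * * 1",   # 02:00 every Monday
    "monthly": "0 2 1 * *",   # 02:00 on the 1st of each month
}

for freq, expr in CRON_EQUIVALENTS.items():
    print(f"{freq:8s} {expr}")
```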
Once the crawler run completes, you can view the crawled assets on Atlan's assets page.
See also
- What does Atlan crawl from Oracle: Complete reference of assets and metadata discovered during crawling
- Preflight checks for Oracle: Validation checks for permissions and configuration before running the crawler