Steps to integrate your Hive with Atlan

Atlan natively supports the Apache Hive metastore, so you can integrate your Hive metadata directly with your Atlan workspace.

💭 TL;DR

You can set up a Hive integration with your Atlan workspace in 4 easy steps:

  1. Select the Source aka Hive 😉

  2. Provide your credentials ✍️

  3. Set up your configuration 🗄️

  4. Schedule automatic updates 🕑

📜 Prerequisites for Hive Integration

Before you start integrating Apache Hive with Atlan, gather the following information, which is needed to establish a connection between Atlan and your Hive server:

  • Hostname - The hostname or IP address of the Hive server to which you are connecting.

  • Hive Server Port - The TCP port on which the Hive server listens for client connections. The default Hive server port is 10000.

  • Hive Metastore Port - The server port used to access metadata about Hive tables and partitions. The default Hive metastore port is 9083.

  • AWS Access Key & Secret Key - Access keys consist of an access key ID and a secret access key, which are used to sign programmatic requests that you make to AWS. See the AWS documentation on access keys to learn more about them and how to create them.

    • S3 permissions required to enable data profiling:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": [ ... ]
          },
          {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [ ... ],
            "Resource": "*"
          }
        ]
      }

  • Default Schema - The schema in which your Hive tables are stored. If you are not sure, use default.

🌟 Pro Tip: If you don't have this information handy, reach out to your cloud or data lake administrator to get these details before you get started!
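Before opening Atlan, it can help to confirm that the Hive server and metastore ports are reachable from your network. The snippet below is a minimal sketch using only the Python standard library. The function names are made up for illustration, the host in the usage section is a placeholder, and the check only tests TCP reachability from the machine where you run it (which may differ from what Atlan itself can reach).

```python
import socket

# Default ports as listed in the prerequisites above.
DEFAULT_HIVE_SERVER_PORT = 10000
DEFAULT_HIVE_METASTORE_PORT = 9083


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_hive_endpoints(
    host: str,
    server_port: int = DEFAULT_HIVE_SERVER_PORT,
    metastore_port: int = DEFAULT_HIVE_METASTORE_PORT,
) -> dict:
    """Check both Hive ports on `host` and report which ones accept connections."""
    return {
        "hive_server": port_is_reachable(host, server_port),
        "hive_metastore": port_is_reachable(host, metastore_port),
    }


if __name__ == "__main__":
    # Placeholder host: replace with the hostname or IP address of your Hive server.
    print(check_hive_endpoints("127.0.0.1"))
```

A False result usually points to a firewall rule, a wrong port, or a wrong hostname, which are good questions to take to your administrator.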

🚀 The step-by-step guide to integrating Hive with Atlan

Once you have the prerequisite information listed in the section above, please follow the steps below to establish a connection and integrate Atlan with your Hive metastore.

STEP 1: Selecting the Source

  1. Log into your Atlan Workspace

  2. On the Home Screen, click on the "New Integration" button in the top right corner. You will see a dialog box with the list of sources available in your workspace.

  3. Select "Hive" from the list of options and click on "Next"

STEP 2: Providing Credentials

  1. You will see an option to either select a pre-configured credential from the drop-down or to create a credential. To set up a new connection, click on the "Create Credential" button.

  2. You will be required to fill in your Hive credentials. Below is an example of the credentials required:

    • Hostname - (the hostname or IP address of your Hive server)
    • Hive server port - 10000
    • Hive metastore port - 9083
    • Schema - default
    • AWS Access key - AKIA5XXXXXXXXXXWIJUS
    • AWS Secret key - R1xXXXXXXXXX5PEdHOUXXXXXXXX7Ooz47

  3. Once you have filled in the details, click on "Next".
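If you collect these values ahead of time, a quick sanity check can catch typos before you paste them into the form. The sketch below is illustrative only: the HiveCredentials class and validate function are hypothetical names, not part of Atlan, and the access key check is a loose heuristic (16 to 128 word characters) rather than a guarantee that the key is valid.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class HiveCredentials:
    """Hypothetical container mirroring the fields of the credential form."""
    hostname: str
    aws_access_key: str
    aws_secret_key: str
    server_port: int = 10000     # default Hive server port
    metastore_port: int = 9083   # default Hive metastore port
    schema: str = "default"


def validate(creds: HiveCredentials) -> list:
    """Return a list of human-readable problems; an empty list means nothing looks wrong."""
    problems = []
    if not creds.hostname.strip():
        problems.append("hostname is empty")
    for name, port in (("server_port", creds.server_port),
                       ("metastore_port", creds.metastore_port)):
        if not 1 <= port <= 65535:
            problems.append(f"{name} {port} is outside the range 1-65535")
    if not creds.schema.strip():
        problems.append("schema is empty")
    # Loose heuristic only: it cannot tell whether the key actually exists.
    if not re.fullmatch(r"\w{16,128}", creds.aws_access_key):
        problems.append("AWS access key does not look like an access key ID")
    if not creds.aws_secret_key:
        problems.append("AWS secret key is empty")
    return problems


if __name__ == "__main__":
    # All values below are placeholders.
    creds = HiveCredentials(
        hostname="10.0.0.5",
        aws_access_key="AKIA5XXXXXXXXXXWIJUS",
        aws_secret_key="placeholder-secret",
    )
    print(validate(creds) or "no obvious problems")
```

Keeping the defaults (10000, 9083, default) in one place also makes it obvious when your environment deviates from them.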

STEP 3: Setting up Configuration

  1. You will now be asked to fill in the details of the schema and table you want to crawl. To include every schema and table, select the checkbox instead. Below is an example:

    • Add Schema - Sales Master
    • Add Table - Daily Sales

  2. Choose whether to run the crawler once or schedule it for a daily, weekly, or monthly run. You will be asked to specify the time zone in which the run is triggered.

  3. Click on "Create". Your connection is now created.

Congratulations, you have now integrated Atlan with your Hive metastore! 🎉
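To see why the time zone matters when scheduling the crawler, the sketch below works out when a daily, weekly, or monthly run would next fire, and what that moment is in UTC. It is an illustration only and does not describe how Atlan schedules runs internally: the following_run function is a hypothetical helper, and clamping a monthly run to the last day of a shorter month is an assumption.

```python
import calendar
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def following_run(previous: datetime, frequency: str) -> datetime:
    """Return the run that follows `previous` for a daily, weekly, or monthly schedule."""
    if frequency == "daily":
        return previous + timedelta(days=1)
    if frequency == "weekly":
        return previous + timedelta(weeks=1)
    if frequency == "monthly":
        # Move to the next calendar month, rolling over the year after December.
        year = previous.year + previous.month // 12
        month = previous.month % 12 + 1
        # Assumption: clamp to the last day of shorter months (e.g. Jan 31 -> Feb 29).
        day = min(previous.day, calendar.monthrange(year, month)[1])
        return previous.replace(year=year, month=month, day=day)
    raise ValueError(f"unknown frequency: {frequency!r}")


if __name__ == "__main__":
    # Example: a run at 02:00 in the Asia/Kolkata time zone (placeholder values).
    first = datetime(2024, 1, 31, 2, 0, tzinfo=ZoneInfo("Asia/Kolkata"))
    for freq in ("daily", "weekly", "monthly"):
        run = following_run(first, freq)
        print(freq, run.isoformat(), "UTC:", run.astimezone(timezone.utc).isoformat())
```

The same wall-clock time maps to different UTC instants depending on the time zone you pick, so choose the zone in which you expect the metadata to be fresh.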

๐Ÿ Monitoring your Hive metastore integration

Once the integration setup is completed, you will be redirected to the Monitor tab for your Hive asset where you can monitor the progress.