How Atlan connects to Hive
Atlan connects to your Hive database to extract technical metadata while maintaining network security and compliance. You can choose between Direct connectivity, for databases reachable from the internet, and Self-deployed runtime, for databases that must remain behind your firewall.
Connect via direct network connection
Atlan's Hive workflow establishes a direct network connection to your database from the Atlan SaaS tenant. This approach works when your Hive database can accept connections from the internet.
Key characteristics of Direct connectivity:
- Atlan connects to your Hive database from the Atlan SaaS tenant over port 10000 (default HiveServer2 port)
- You provide connection details (hostname, port, credentials, certificates) when creating a crawler workflow
- For Kerberos authentication, Atlan must reach the KDC on port 88 (TCP/UDP) in addition to HiveServer2
- Atlan executes read-only SQL queries to discover your database structure
- Your Hive database accepts inbound network connections from Atlan's IP addresses
- All credentials, keytabs, and certificates are stored encrypted in Atlan Cloud
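Before creating a crawler workflow, it can help to confirm that the required ports are listening. The following is a minimal sketch, not part of Atlan's tooling: the function names `can_connect` and `check_hive_endpoints` are hypothetical. It only tests TCP reachability from the machine where you run it, so it doesn't prove that Atlan's IP addresses are allowed through your firewall, and it doesn't cover Kerberos traffic over UDP.

```python
import socket
from typing import Dict, Optional


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the hostname and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_hive_endpoints(hive_host: str, kdc_host: Optional[str] = None) -> Dict[str, bool]:
    """Check HiveServer2 (port 10000) and, for Kerberos, the KDC (port 88) over TCP."""
    results = {f"{hive_host}:10000": can_connect(hive_host, 10000)}
    if kdc_host is not None:
        results[f"{kdc_host}:88"] = can_connect(kdc_host, 88)
    return results
```

For example, `check_hive_endpoints("hive.example.internal", kdc_host="kdc.example.internal")` returns a dictionary that maps each endpoint to `True` or `False`. Both hostnames are placeholders.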
Connect via self-deployed runtime
A runtime service deployed within your network acts as a secure bridge between Atlan Cloud and your Hive database. This approach works when your Hive database must remain fully isolated behind your firewall.
Key characteristics of Self-deployed runtime:
- A runtime service sits within your network perimeter, deployed on Docker Compose or Kubernetes
- The runtime maintains an outbound HTTPS connection to Atlan Cloud (port 443) and a local network connection to HiveServer2 (port 10000)
- When you create a crawler workflow, Atlan Cloud sends metadata extraction requests to the runtime
- The runtime translates requests into SQL queries, executes them on HiveServer2, and returns results to Atlan Cloud
- Your Hive database never exposes ports to the internet—all connections are initiated from within your network
How it protects your data
Hive databases contain critical business data and operational information. Atlan's connection architecture protects your environment through multiple security layers.
Metadata extraction, not data replication
Atlan extracts only structural metadata—schemas, databases, tables, views, materialized views, columns, and their relationships. The actual business data in your tables remains in your Hive database.
For example, if you have a CUSTOMERS table with customer records, Atlan discovers:
- The table structure (table name, database, schema)
- Column definitions (column names, data types, nullability)
- Relationships (foreign keys, if configured)
Atlan never queries or stores the customer records themselves.
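To make the distinction concrete, the sketch below models the kind of structural record described above. The class names, field names, database name, and column values are illustrative assumptions, not Atlan's actual data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ColumnMetadata:
    """Structural facts about one column; no cell values."""
    name: str
    data_type: str
    nullable: bool


@dataclass(frozen=True)
class TableMetadata:
    """Structural facts about one table; no rows."""
    database: str
    name: str
    columns: List[ColumnMetadata] = field(default_factory=list)


# Hypothetical metadata for the CUSTOMERS table in the example above.
customers = TableMetadata(
    database="sales",
    name="CUSTOMERS",
    columns=[
        ColumnMetadata("customer_id", "bigint", nullable=False),
        ColumnMetadata("email", "string", nullable=True),
    ],
)
```

The record describes the shape of the table only. It has no field that could hold a customer's ID or email address.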
Read-only operations
All database queries are read-only SELECT statements. The connector can't:
- Modify data (INSERT, UPDATE, DELETE)
- Create or drop database objects
- Change any configuration
- Execute stored procedures or functions
- Grant or revoke permissions
The Hive user permissions you grant control exactly what the connector can access.
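As an illustration of what "read-only" means in practice, the sketch below classifies a statement by its first keyword. It's a hypothetical helper for reviewing queries, not how Atlan enforces read-only access. The permissions granted to the Hive user remain the actual control.

```python
import re

# Captures the first keyword of a statement, ignoring leading whitespace.
_FIRST_KEYWORD = re.compile(r"^\s*([A-Za-z]+)")


def is_read_only(statement: str) -> bool:
    """Return True only for a single statement that begins with SELECT."""
    # Drop one optional trailing semicolon and surrounding whitespace.
    body = statement.strip().rstrip(";").strip()
    if ";" in body:
        # Conservatively reject anything that looks like multiple statements,
        # even if the semicolon sits inside a string literal.
        return False
    match = _FIRST_KEYWORD.match(body)
    return bool(match) and match.group(1).upper() == "SELECT"
```

With this helper, `is_read_only("SELECT name FROM customers")` is `True`, while `is_read_only("DROP TABLE customers")` is `False`.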
Credential encryption
Hive connection credentials are encrypted at rest and in transit:
Direct connectivity:
- Credentials are encrypted before storage in Atlan Cloud
- Encryption keys are managed by Atlan's key management system
- Credentials are decrypted only when establishing connections
Self-deployed runtime:
- Basic authentication credentials never leave your network perimeter
- The runtime retrieves credentials from your enterprise-managed secret vaults only when needed
- Kerberos keytabs and certificates are encrypted in object storage
- File downloads from object storage use encrypted channels
Network isolation with Self-deployed runtime
Your Hive database gains complete network isolation from the internet:
- The database only accepts connections from the runtime within your local network
- The runtime itself only makes outbound HTTPS connections to Atlan Cloud
- No inbound connections to your network are required
- Your network team can control runtime connectivity through firewall rules
Authentication security
With Kerberos authentication:
- No passwords are transmitted over the network
- Tickets are time-limited and automatically expire
- Keytabs provide secure, non-interactive authentication
- Mutual authentication verifies both client and server identity
With TLS/mTLS:
- All traffic is encrypted to prevent eavesdropping
- Server identity is verified through certificate validation
- Client identity is verified in mTLS configurations
- Certificate expiration enforces regular security reviews
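The TLS and mTLS properties above correspond to standard client-side settings. The sketch below uses Python's standard `ssl` module to show them. The function name `build_tls_context` and its parameters are hypothetical, and this isn't Atlan's connector code.

```python
import ssl
from typing import Optional


def build_tls_context(
    ca_file: Optional[str] = None,
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context; add a client certificate for mTLS."""
    # The default context requires a valid server certificate and checks
    # that the certificate matches the hostname.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if client_cert is not None:
        # Presenting a client certificate lets the server verify the client (mTLS).
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context
```

Calling `build_tls_context()` with no arguments gives server-verified TLS that trusts the system's default certificate authorities. Passing `client_cert` and `client_key` file paths adds the client identity used in mTLS.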
See also
- Set up Hive: Configure authentication and permissions
- Crawl Hive: Configure and run extraction (including Agent deployment)
- Self-Deployed Runtime architecture: Core components and data flow
- Self-Deployed Runtime security: Security architecture, authentication, and encryption