Troubleshooting Hive connectivity

This guide helps you resolve common issues when connecting Atlan to Hive, including authentication failures, Kerberos errors, certificate problems, and network connectivity issues.

Basic authentication issues

Invalid username or password

Problem: Authentication fails with "Invalid username or password" error

Cause: Credentials are incorrect or the user account is locked/disabled

Solution:

  1. Verify credentials by testing them with a Hive client (beeline):
    beeline -u "jdbc:hive2://hostname:10000/default" -n username -p password
  2. Check if the user account is active in your authentication system
  3. Verify the username format matches your Hive metastore configuration
  4. Check that special characters in the password aren't being escaped or encoded incorrectly

User lacks permissions

Problem: Connection succeeds but no metadata is extracted

Cause: User account lacks SELECT permissions on database objects

Solution:

  1. Grant SELECT permissions on all databases you want to crawl:
    GRANT SELECT ON DATABASE database_name TO USER username;
  2. Verify permissions are applied:
    SHOW GRANT USER username;
  3. Check HDFS ACLs, LDAP groups, and any policy engines (Ranger, Sentry) that may restrict access

Kerberos authentication issues

KDC unreachable

Problem: Error message Cannot contact any KDC for realm REALM_NAME

Cause: Network can't reach the Kerberos Key Distribution Center

Solution:

  1. Verify KDC hostname/IP is correct in krb5.conf
  2. Test network connectivity to KDC:
    telnet kdc-hostname 88
    nc -zv kdc-hostname 88
  3. Verify firewall rules permit traffic to port 88 (TCP and UDP)
  4. For Self-Deployed Runtime, verify the runtime can reach the KDC from within your network
  5. Check krb5.conf [realms] section has correct KDC addresses:
    [realms]
    YOUR.REALM = {
      kdc = kdc.example.com:88
      admin_server = kdc.example.com:749
    }
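If telnet or nc isn't available on the host running the crawler, the reachability test in step 2 can be sketched with only the Python standard library (the KDC hostname below is a placeholder):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: KDCs listen on port 88 over both TCP and UDP; this checks TCP only.
# can_reach("kdc.example.com", 88)
```

Note this only proves TCP reachability; a firewall that blocks UDP 88 can still break some Kerberos clients.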

Server not found in Kerberos database

Problem: Error message "Server <service_principal> not found in Kerberos database"

Cause: The service principal name doesn't match what's registered in the KDC

Solution:

  1. Verify the service principal exists in the KDC:
    kadmin.local -q "listprincs hive/*"
  2. Check the exact service principal format with your Hadoop administrator
  3. Verify the service name in Atlan configuration matches your Hive setup (typically hive)
  4. Confirm the hostname matches:
    • Use FQDN (fully qualified domain name) for the Hive host
    • Check DNS resolution:
      nslookup hostname
      host hostname

Wrong realm derived from hostname

Problem: Error message "Server krbtgt/WRONG.REALM@CORRECT.REALM not found in Kerberos database"

Cause: DNS hostname canonicalization is deriving the wrong realm from the server hostname

Solution:

  1. Disable DNS canonicalization in krb5.conf:
    [libdefaults]
    dns_canonicalize_hostname = false
    rdns = false
  2. Add explicit domain-to-realm mappings:
    [domain_realm]
    .your-domain.com = YOUR.REALM
    your-domain.com = YOUR.REALM
    .amazonaws.com = YOUR.REALM
    amazonaws.com = YOUR.REALM
    .compute.internal = YOUR.REALM
    compute.internal = YOUR.REALM
  3. Verify the mapping works by testing kinit locally
  4. Re-upload the updated krb5.conf file in Atlan

This issue commonly occurs when connecting to cloud-hosted Hive clusters (AWS, Azure, GCP) where the FQDN includes cloud provider domains.

Invalid keytab file

Problem: Error message "Key version number for principal in key table is incorrect"

Cause: Keytab file doesn't match the current principal keys in the KDC

Solution:

  1. Regenerate the keytab file:
    kadmin.local -q "ktadd -k /path/to/new.keytab principal@REALM"
  2. Test the new keytab locally before uploading:
    kinit -kt /path/to/new.keytab principal@REALM
    klist
    kdestroy
  3. Upload the new keytab file in Atlan
  4. If the principal was recently changed or password reset, generate a fresh keytab

Ticket expiration issues

Problem: Connection works initially but fails after several hours

Cause: Kerberos tickets expired and couldn't be renewed

Solution:

  1. Check ticket lifetime settings in krb5.conf:
    [libdefaults]
    ticket_lifetime = 24h
    renew_lifetime = 7d
  2. Atlan automatically renews tickets, but if workflows run longer than the renewable lifetime, they'll fail
  3. Consider shorter extraction schedules or longer renewal lifetimes
  4. Verify the keytab file is valid and can generate new tickets
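To sanity-check whether your longest extraction run fits inside renew_lifetime, the krb5.conf duration shorthand can be decoded with a simplified sketch (real krb5 also accepts compound forms like "1d2h0m", which this ignores):

```python
# Convert simple krb5.conf lifetime strings ("24h", "7d", "300") to seconds.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def lifetime_seconds(value: str) -> int:
    """Parse a single-unit krb5 duration; a bare integer means seconds."""
    value = value.strip().lower()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)

# A 6-hour workflow fits inside renew_lifetime = 7d:
# 6 * 3600 < lifetime_seconds("7d")
```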

SASL handshake failed

Problem: Error message "TSocket read 0 bytes" or "SASL handshake failed"

Cause: Multiple possible causes related to Kerberos configuration or network

Solution:

  1. Verify the service principal format is correct:
    • Expected: hive/hostname@REALM
    • Service name (typically hive) must match HiveServer2 configuration
  2. Check that HiveServer2 is configured for Kerberos:
    <!-- In hive-site.xml -->
    <property>
      <name>hive.server2.authentication</name>
      <value>KERBEROS</value>
    </property>
  3. Verify network connectivity to HiveServer2 port (default 10000):
    telnet hive-hostname 10000
  4. Check HiveServer2 logs for details on why it rejected the connection
  5. Verify the client can resolve the HiveServer2 hostname properly:
    • For Self-Deployed Runtime, add hostAliases (Kubernetes) or extra_hosts (Docker) if DNS resolution fails

TLS/MTLS certificate issues

CA certificate validation failed

Problem: Error message "SSL certificate verification failed" or "Unable to verify server certificate"

Cause: CA certificate doesn't match the certificate presented by HiveServer2

Solution:

  1. Obtain the correct CA certificate that signed your HiveServer2's SSL certificate
  2. Verify the certificate chain from your server:
    openssl s_client -connect hostname:10000 -showcerts
  3. Verify the CA certificate file is valid:
    openssl x509 -in ca-cert.pem -text -noout
  4. Check certificate expiration date
  5. Upload the correct CA certificate in Atlan

Client certificate rejected

Problem: Error message "Client certificate rejected" or "SSL handshake failed"

Cause: HiveServer2 doesn't trust the client certificate or the certificate is invalid

Solution:

  1. Verify HiveServer2 is configured to accept your client certificate
  2. Check that the client certificate is signed by a CA trusted by HiveServer2
  3. Verify certificate and key pair match:
    # Extract public key from certificate
    openssl x509 -in client-cert.pem -pubkey -noout > cert-pubkey.pem

    # Extract public key from private key
    openssl pkey -in client-key.pem -pubout > key-pubkey.pem

    # Compare (should be identical)
    diff cert-pubkey.pem key-pubkey.pem
  4. Check certificate expiration:
    openssl x509 -in client-cert.pem -noout -dates
  5. Verify certificate includes required extensions (Extended Key Usage: Client Authentication)

Client key passphrase incorrect

Problem: Error message Could not decrypt key or Invalid passphrase

Cause: The passphrase for the encrypted client key is incorrect or the key format is wrong

Solution:

  1. Verify the passphrase is correct by testing locally:
    openssl rsa -in client-key.pem -check
  2. If the key is encrypted and you don't need encryption, decrypt it:
    openssl rsa -in client-key-encrypted.pem -out client-key-plain.pem
  3. Re-upload the key file (encrypted or plain) and provide the correct passphrase if encrypted
  4. Verify no extra whitespace in the passphrase field

Certificate format issues

Problem: Error message Could not load certificate or Invalid certificate format

Cause: Certificate or key file is in the wrong format or corrupted

Solution:

  1. Convert certificates to PEM format if needed:
    # From DER to PEM
    openssl x509 -inform der -in certificate.der -out certificate.pem

    # From P12 to PEM
    openssl pkcs12 -in certificate.p12 -out certificate.pem -nodes
  2. Verify file format:
    # Should show "BEGIN CERTIFICATE" for certs
    head certificate.pem

    # Should show "BEGIN PRIVATE KEY" or "BEGIN RSA PRIVATE KEY" for keys
    head client-key.pem
  3. Verify files aren't corrupted during upload (try uploading as .zip if direct upload fails)
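The head checks in step 2 can be automated: PEM files are base64 text with BEGIN/END markers, while DER is binary ASN.1 whose first byte is 0x30 (a SEQUENCE tag). A minimal classifier sketch:

```python
def detect_format(data: bytes) -> str:
    """Classify raw certificate/key bytes as 'pem-cert', 'pem-key',
    'der', or 'unknown'."""
    if b"BEGIN CERTIFICATE" in data:
        return "pem-cert"
    if b"BEGIN PRIVATE KEY" in data or b"BEGIN RSA PRIVATE KEY" in data:
        return "pem-key"
    if data[:1] == b"\x30":  # ASN.1 SEQUENCE tag that opens DER structures
        return "der"
    return "unknown"
```

A result of "der" or "unknown" for a file you expected to be PEM usually means it needs the openssl conversion shown in step 1.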

Network connectivity issues

Connection timeout

Problem: Error message "Connection timeout" or "Failed to connect to hostname:port"

Cause: Network can't reach HiveServer2 or firewall blocks the connection

Solution:

  1. Test network connectivity:
    telnet hostname 10000
    nc -zv hostname 10000
  2. Verify HiveServer2 is running:
    # Check HiveServer2 process
    jps | grep HiveServer2
  3. Check that firewall rules permit traffic to port 10000
  4. For Direct connectivity, verify Hive server accepts connections from Atlan's IP addresses
  5. For Self-Deployed Runtime, verify the runtime can reach Hive from within your network

Port already in use

Problem: HiveServer2 won't start with "Address already in use" error

Cause: Another process is using port 10000

Solution:

  1. Identify the process using the port:
    lsof -i :10000
    netstat -tuln | grep 10000
  2. Stop the conflicting process or configure HiveServer2 to use a different port
  3. Update the port number in Atlan configuration if using a non-standard port

DNS resolution issues

Problem: Error message Could not resolve hostname

Cause: DNS can't resolve the HiveServer2 hostname

Solution:

  1. Verify DNS resolution:
    nslookup hostname
    host hostname
    dig hostname
  2. Use IP address instead of hostname if DNS is unreliable
  3. For Self-Deployed Runtime with problematic DNS:
    • Kubernetes: Add hostAliases to the pod spec:
      hostAliases:
        - ip: "10.0.1.100"
          hostnames:
            - "hadoop-master.company.com"
    • Docker Compose: Add extra_hosts:
      extra_hosts:
        - "hadoop-master.company.com:10.0.1.100"
  4. Verify /etc/hosts entries if using host file resolution
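The nslookup/host/dig checks in step 1 can also be reproduced from the Python standard library, which exercises the same resolver path (getaddrinfo) that most runtimes use. The hostname below is a placeholder:

```python
import socket

def resolve(host: str) -> list[str]:
    """Return every address DNS gives for host, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, None)
        return sorted({info[4][0] for info in infos})
    except socket.gaierror:
        return []

# Example: resolve("hadoop-master.company.com")
```

An empty list from inside the runtime, when nslookup works on your laptop, points at split-horizon DNS and is the case the hostAliases/extra_hosts workaround above addresses.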

Metastore limitations

Unable to connect to metastore

Problem: HiveServer2 can't connect to the Hive metastore database

Cause: Metastore database is down or connection is misconfigured

Solution:

  1. Verify metastore database (MySQL, PostgreSQL) is running
  2. Check HiveServer2 metastore configuration in hive-site.xml:
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://metastore-host:3306/hive</value>
    </property>
  3. Test metastore database connectivity from HiveServer2 host
  4. Check metastore database credentials are correct
  5. Review HiveServer2 logs for specific metastore errors

Metastore schema version mismatch

Problem: Error message "Metastore schema version doesn't match"

Cause: Hive metastore schema version incompatible with HiveServer2 version

Solution:

  1. Check the current schema version by querying the metastore database:
    SELECT * FROM VERSION;
  2. Back up the metastore database before upgrading
  3. Run the schema upgrade tool:
    schematool -dbType mysql -upgradeSchema
  4. Consult your Hadoop administrator if unsure about schema compatibility

Performance issues

Slow metadata extraction

Problem: Metadata extraction takes hours to complete

Cause: Large number of databases/tables or inefficient queries

Solution:

  1. Use include/exclude filters to limit scope of extraction
  2. Exclude system schemas and temporary tables
  3. Schedule extractions during low-usage periods
  4. For huge metastores (millions of tables), consider offline extraction method
  5. Check HiveServer2 and metastore database performance

Out of memory errors

Problem: HiveServer2 or Atlan workflow fails with out-of-memory error

Cause: Extracting too much metadata or insufficient memory allocation

Solution:

  1. Increase HiveServer2 heap size if the server is running out of memory
  2. Use include filters to reduce the scope of extraction
  3. Break extraction into multiple workflows for different database groups
  4. For Self-Deployed Runtime, increase memory allocation:
    resources:
      limits:
        memory: 16Gi
      requests:
        memory: 16Gi

See also