Distributed locking issues

Resolve common distributed locking errors related to Redis connectivity, lock acquisition, and Sentinel configuration in your Atlan applications.

Redis connection failure

Error

ATLAN-CLIENT-503-00: Redis connection failure

Cause

The application can't establish a connection to the Redis server. This occurs when Redis is unreachable, authentication credentials are incorrect, or network connectivity issues prevent communication.

Solution

Verify Redis server is running:

# Check Redis service status
systemctl status redis

Test basic connectivity:

# Test without authentication
redis-cli -h $REDIS_HOST -p $REDIS_PORT ping

# Test with authentication
redis-cli -h $REDIS_HOST -p $REDIS_PORT -a $REDIS_PASSWORD ping

Check that the environment variables are set correctly:

echo $REDIS_HOST
echo $REDIS_PORT
echo $REDIS_PASSWORD
echo $IS_LOCKING_DISABLED
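
Because echoing REDIS_PASSWORD prints the secret to the terminal, a script that only reports whether each variable is set may be preferable on shared hosts. The following is a minimal sketch and not part of any Atlan SDK: the helper name describe_env and the display strings are hypothetical, and it inspects only the four variable names listed above.

```python
import os

# Variable names come from the checks above. describe_env is a hypothetical
# helper for illustration; the password value is never included in the report.
CHECKED_VARS = ("REDIS_HOST", "REDIS_PORT", "REDIS_PASSWORD", "IS_LOCKING_DISABLED")
SECRET_VARS = {"REDIS_PASSWORD"}


def describe_env(env=None):
    """Return a {name: display_value} report, masking secret values."""
    env = os.environ if env is None else env
    report = {}
    for name in CHECKED_VARS:
        value = env.get(name)
        if not value:
            report[name] = "<unset>"
        elif name in SECRET_VARS:
            report[name] = "<set, hidden>"
        else:
            report[name] = value
    return report


if __name__ == "__main__":
    for name, shown in describe_env().items():
        print(f"{name}={shown}")
```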

Verify network connectivity:

# Test port connectivity
telnet $REDIS_HOST $REDIS_PORT

# Check whether the Redis port is open, closed, or filtered from this host
nmap -p $REDIS_PORT $REDIS_HOST

Check that firewall rules permit connections on the Redis port from your application hosts.
Validate that the Redis authentication credentials match your configuration.
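
If telnet or nmap isn't installed on the application host, the same port check can be approximated with the Python standard library. This is a minimal sketch under stated assumptions: can_connect is a hypothetical helper name, and the fallback values (localhost, and 6379 as the default Redis port) apply only when REDIS_HOST and REDIS_PORT are unset. A successful TCP connection shows that the port is reachable; it doesn't validate the password.

```python
import os
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    host = os.environ.get("REDIS_HOST", "localhost")
    port = int(os.environ.get("REDIS_PORT", "6379"))  # 6379 is the Redis default
    state = "reachable" if can_connect(host, port) else "unreachable"
    print(f"{host}:{port} is {state}")
```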

Lock not available

Error

ATLAN-ACTIVITY-503-01: Lock not available

Cause

The activity timed out while waiting for an available lock slot. This occurs when lock contention is high and all slots are occupied for longer than the activity timeout period.
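
To make the failure mode concrete, here is a simplified in-memory model of slot-based locking. It is an illustration only and makes no claim about how the Redis-backed lock is implemented; SlotPool and its methods are hypothetical names.

```python
class SlotPool:
    """Toy model (hypothetical): at most max_locks holders at a time."""

    def __init__(self, max_locks):
        self.max_locks = max_locks
        self.holders = {}  # slot index -> owner id

    def try_acquire(self, owner):
        """Return a free slot index, or None if every slot is occupied."""
        for slot in range(self.max_locks):
            if slot not in self.holders:
                self.holders[slot] = owner
                return slot
        return None

    def release(self, slot):
        """Free a slot; releasing an already-free slot is a no-op."""
        self.holders.pop(slot, None)


if __name__ == "__main__":
    pool = SlotPool(max_locks=2)
    print(pool.try_acquire("activity-a"))  # 0
    print(pool.try_acquire("activity-b"))  # 1
    print(pool.try_acquire("activity-c"))  # None: all slots are occupied
    pool.release(0)
    print(pool.try_acquire("activity-c"))  # 0: a slot was freed
```

An activity in the position of activity-c keeps retrying. If no slot is released before its timeout elapses, it fails with the error above.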

Solution

Increase the schedule_to_close_timeout for high-contention activities:

from datetime import timedelta

# Inside the workflow definition
result = await workflow.execute_activity(
    locked_activity,
    args=[data],
    schedule_to_close_timeout=timedelta(minutes=15),  # Increased from 5
)

Monitor lock contention patterns to identify bottlenecks.
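
This page doesn't say how lock wait times are exposed, so the following sketch assumes you have already collected them (for example, from your own timing logs) as a list of seconds. The function name summarize_waits and the 80% threshold are illustrative choices, not Atlan settings.

```python
import statistics


def summarize_waits(wait_seconds, timeout_seconds, near_ratio=0.8):
    """Summarize lock wait times and count waits close to the activity timeout.

    near_ratio is an arbitrary illustrative threshold: waits at or above
    near_ratio * timeout_seconds are counted as "near timeout".
    """
    if not wait_seconds:
        return {"count": 0}
    threshold = near_ratio * timeout_seconds
    return {
        "count": len(wait_seconds),
        "mean": statistics.mean(wait_seconds),
        "max": max(wait_seconds),
        "near_timeout": sum(1 for w in wait_seconds if w >= threshold),
    }


if __name__ == "__main__":
    # Hypothetical sample: four observed waits against a 300-second timeout.
    print(summarize_waits([10, 20, 250, 290], timeout_seconds=300))
```

A rising near_timeout count for one lock name suggests that lock is the bottleneck.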

Consider increasing max_locks if resource capacity supports it:

@needs_lock(max_locks=10, lock_name="api_calls")  # Increased from 5
async def api_activity(data: dict):
    return await process_data(data)

Split long-running operations into separate activities to reduce lock duration.

Adjust LOCK_RETRY_INTERVAL to retry more frequently:

export LOCK_RETRY_INTERVAL=3  # Reduced from 5 seconds
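
The settings above interact, and a rough estimate helps when choosing values. The sketch below assumes one acquisition attempt per LOCK_RETRY_INTERVAL until schedule_to_close_timeout elapses, and a uniform hold time per activity; both are simplifications, and the function names are hypothetical.

```python
import math
from datetime import timedelta


def retry_attempts(timeout, retry_interval_seconds):
    """Number of retry intervals that fit inside the activity timeout."""
    return math.floor(timeout.total_seconds() / retry_interval_seconds)


def estimated_wait_seconds(activities_ahead, max_locks, hold_seconds):
    """Rough wait before a slot frees up, assuming uniform hold times.

    activities_ahead counts every activity that must finish first,
    including those currently holding a slot.
    """
    return math.ceil(activities_ahead / max_locks) * hold_seconds


if __name__ == "__main__":
    print(retry_attempts(timedelta(minutes=5), 5))    # 60
    print(retry_attempts(timedelta(minutes=15), 3))   # 300
    print(estimated_wait_seconds(20, 5, 60))          # 240
    print(estimated_wait_seconds(20, 10, 60))         # 120
```

Under these assumptions, doubling max_locks from 5 to 10 halves the estimated wait. If the estimate exceeds the activity timeout, expect this error.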

Sentinel master discovery failure

Error

Failed to discover Redis master through Sentinel

Cause

The application can't discover the Redis master through the Sentinel instances. This occurs when Sentinel instances aren't running, the Sentinel service name is incorrect, or network connectivity to the Sentinel hosts is blocked.

Solution

Verify that all Sentinel instances are running:

# Check sentinel service status on each host
systemctl status redis-sentinel

Test Sentinel connectivity:

# Test connection to each sentinel
redis-cli -h sentinel1.example.com -p 26379 sentinel masters
redis-cli -h sentinel2.example.com -p 26379 sentinel masters
redis-cli -h sentinel3.example.com -p 26379 sentinel masters

Verify that the Sentinel service name matches your configuration:

# Query the master by service name (replace mymaster with your configured service name)
redis-cli -h sentinel1.example.com -p 26379 sentinel master mymaster

Check that the environment variables are correct:

echo $REDIS_SENTINEL_HOSTS
echo $REDIS_SENTINEL_SERVICE
echo $REDIS_PASSWORD

Check network connectivity to all Sentinel hosts:

# Test connectivity to each sentinel
telnet sentinel1.example.com 26379
telnet sentinel2.example.com 26379
telnet sentinel3.example.com 26379

Validate that the Sentinel configuration files have the correct master name and replication setup.
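
The per-host connectivity checks above can also be scripted. This sketch assumes that REDIS_SENTINEL_HOSTS holds a comma-separated list of host:port entries; the page doesn't state the format, so adjust the parsing if yours differs. The function names are hypothetical, IPv6 literals aren't handled, and 26379 is the default Sentinel port.

```python
import os
import socket


def parse_sentinel_hosts(raw, default_port=26379):
    """Parse 'host1:26379,host2' into [(host, port), ...] (assumed format)."""
    hosts = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if sep:
            hosts.append((host, int(port)))
        else:
            hosts.append((entry, default_port))
    return hosts


def unreachable_sentinels(hosts, timeout=3.0):
    """Return the (host, port) pairs that fail a plain TCP connect."""
    failed = []
    for host, port in hosts:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failed.append((host, port))
    return failed


if __name__ == "__main__":
    sentinels = parse_sentinel_hosts(os.environ.get("REDIS_SENTINEL_HOSTS", ""))
    for host, port in unreachable_sentinels(sentinels):
        print(f"unreachable: {host}:{port}")
```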

Missing schedule_to_close_timeout

Error

Activity with @needs_lock requires schedule_to_close_timeout

Cause

The activity decorated with @needs_lock doesn't specify a schedule_to_close_timeout. The system requires this timeout to calculate lock TTL and prevent deadlocks.
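
The sketch below illustrates why a TTL prevents deadlocks: if a holder crashes without releasing, the lock becomes available again once the TTL expires. It is a toy in-memory model with hypothetical names. The actual TTL formula isn't given on this page, so the example simply assumes the TTL equals the timeout.

```python
import time
from datetime import timedelta


class TTLLock:
    """Toy in-memory lock (hypothetical) whose ownership expires after ttl."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl_seconds = ttl.total_seconds()
        self.clock = clock  # injectable so the example can simulate time
        self.owner = None
        self.expires_at = 0.0

    def try_acquire(self, owner):
        """Take the lock if it is free or its previous TTL has expired."""
        now = self.clock()
        if self.owner is not None and now < self.expires_at:
            return False  # still held and not yet expired
        self.owner = owner
        self.expires_at = now + self.ttl_seconds
        return True


if __name__ == "__main__":
    fake_now = [0.0]
    lock = TTLLock(ttl=timedelta(minutes=5), clock=lambda: fake_now[0])
    print(lock.try_acquire("worker-1"))  # True
    print(lock.try_acquire("worker-2"))  # False: worker-1 holds the lock
    fake_now[0] = 301.0                  # worker-1 crashed and never released
    print(lock.try_acquire("worker-2"))  # True: the 300-second TTL expired
```

Without a timeout there is no value to derive the TTL from, and a crashed holder would block every other activity indefinitely.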

Solution

Add schedule_to_close_timeout when executing the activity:

from datetime import timedelta

# Before (incorrect): no timeout, so the lock TTL can't be calculated
result = await workflow.execute_activity(
    locked_activity,
    args=[data],
)

# After (correct)
result = await workflow.execute_activity(
    locked_activity,
    args=[data],
    schedule_to_close_timeout=timedelta(minutes=5),
)

Need help

If you still need assistance after trying these steps, contact Atlan support by submitting a request.