Distributed locking issues
Resolve common distributed locking errors related to Redis connectivity, lock acquisition, and Sentinel configuration in your Atlan applications.
Redis connection failure
ATLAN-CLIENT-503-00: Redis connection failure
Cause
The application can't establish a connection to the Redis server. This occurs when Redis is unreachable, authentication credentials are incorrect, or network connectivity issues prevent communication.
Solution
- Verify Redis server is running:
    # Check Redis service status
    systemctl status redis
- Test basic connectivity:
    # Test without authentication
    redis-cli -h $REDIS_HOST -p $REDIS_PORT ping
    # Test with authentication
    redis-cli -h $REDIS_HOST -p $REDIS_PORT -a $REDIS_PASSWORD ping
- Check environment variables are set correctly:
    echo $REDIS_HOST
    echo $REDIS_PORT
    echo $REDIS_PASSWORD
    echo $IS_LOCKING_DISABLED
- Verify network connectivity:
    # Test port connectivity
    telnet $REDIS_HOST $REDIS_PORT
    # Check whether the port is open or filtered
    nmap -p $REDIS_PORT $REDIS_HOST
- Check that firewall rules permit connections on the Redis port from your application hosts.
- Validate that the Redis authentication credentials match your configuration.
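The connectivity and credential checks above can also be run from the application host in one step. The following is a minimal sketch using only the Python standard library, not part of any Atlan SDK: the helper names `diagnose_redis` and `_command` are hypothetical, and it assumes a Redis server that speaks the standard RESP protocol and uses password-only `AUTH`.

```python
import socket


def _command(sock, *parts):
    """Send one command in RESP array form and return the first reply line."""
    payload = b"*%d\r\n" % len(parts)
    for part in parts:
        data = part.encode()
        payload += b"$%d\r\n%s\r\n" % (len(data), data)
    sock.sendall(payload)
    reply = b""
    while not reply.endswith(b"\r\n"):
        chunk = sock.recv(1024)
        if not chunk:  # server closed the connection
            break
        reply += chunk
    return reply.decode(errors="replace").strip()


def diagnose_redis(host, port, password=None, timeout=3.0):
    """Classify the failure: unreachable, auth_failed, auth_required, or ok."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, DNS failures, and timeouts
        return "unreachable"
    with sock:
        try:
            if password:
                # Redis error replies start with "-"
                if _command(sock, "AUTH", password).startswith("-"):
                    return "auth_failed"
            reply = _command(sock, "PING")
        except OSError:
            return "unreachable"
    if reply == "+PONG":
        return "ok"
    if reply.startswith("-NOAUTH"):
        return "auth_required"
    return "unexpected reply: " + reply
```

For example, `diagnose_redis(os.environ["REDIS_HOST"], int(os.environ["REDIS_PORT"]), os.environ.get("REDIS_PASSWORD"))` returns `"unreachable"` for a network or firewall problem and `"auth_failed"` or `"auth_required"` for a credentials problem, which narrows down which of the steps above to revisit.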
Lock not available
ATLAN-ACTIVITY-503-01: Lock not available
Cause
The activity timed out while waiting for an available lock slot. This occurs when lock contention is high and all slots are occupied for longer than the activity timeout period.
Solution
- Increase the schedule_to_close_timeout for high-contention activities:
    from datetime import timedelta

    result = await workflow.execute_activity(
        locked_activity,
        args=[data],
        schedule_to_close_timeout=timedelta(minutes=15),  # Increased from 5
    )
- Monitor lock contention patterns to identify bottlenecks.
- Consider increasing max_locks if resource capacity supports it:
    @needs_lock(max_locks=10, lock_name="api_calls")  # Increased from 5
    async def api_activity(data: dict):
        return await process_data(data)
- Split long-running operations into separate activities to reduce lock duration.
- Adjust LOCK_RETRY_INTERVAL to retry more frequently:
    export LOCK_RETRY_INTERVAL=3  # Reduced from 5 seconds
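To judge whether a timeout or max_locks value is large enough, a rough estimate of the queueing delay can help. The sketch below is back-of-the-envelope arithmetic, not the locking implementation: the function name `worst_case_wait` is hypothetical, and it assumes that all pending activities start together, that each holds its slot for the same duration, and that a waiting activity may lose up to one retry interval each time a slot frees up.

```python
import math
from datetime import timedelta


def worst_case_wait(
    pending: int,
    max_locks: int,
    hold: timedelta,
    retry_interval: timedelta = timedelta(0),
) -> timedelta:
    """Estimate how long the last pending activity waits for a slot.

    Activities run in "waves" of max_locks at a time; the last wave
    waits for every earlier wave to finish (assumed model, see lead-in).
    """
    waves = math.ceil(pending / max_locks)
    return (waves - 1) * (hold + retry_interval)
```

Under these assumptions, 20 activities that each hold a lock for 2 minutes with max_locks=5 give a worst-case wait of 6 minutes. Adding the activity's own 2 minutes of run time, the last activity needs about 8 minutes, which exceeds a 5-minute schedule_to_close_timeout. Raising max_locks to 10 cuts the wait to 2 minutes.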
Sentinel master discovery failure
Failed to discover Redis master through Sentinel
Cause
The application can't connect to Redis through Sentinel instances. This occurs when sentinel instances aren't running, the sentinel service name is incorrect, or network connectivity to sentinels is blocked.
Solution
- Verify all sentinel instances are running:
    # Check sentinel service status on each host
    systemctl status redis-sentinel
- Test sentinel connectivity:
    # Test connection to each sentinel
    redis-cli -h sentinel1.example.com -p 26379 sentinel masters
    redis-cli -h sentinel2.example.com -p 26379 sentinel masters
    redis-cli -h sentinel3.example.com -p 26379 sentinel masters
- Verify the sentinel service name matches your configuration:
    # Check sentinel configuration
    redis-cli -h sentinel1.example.com -p 26379 sentinel master mymaster
- Check environment variables are correct:
    echo $REDIS_SENTINEL_HOSTS
    echo $REDIS_SENTINEL_SERVICE
    echo $REDIS_PASSWORD
- Check network connectivity to all sentinel hosts:
    # Test connectivity to each sentinel
    telnet sentinel1.example.com 26379
    telnet sentinel2.example.com 26379
    telnet sentinel3.example.com 26379
- Validate sentinel configuration files have the correct master name and replication setup.
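The per-host connectivity checks above can be scripted so that every configured sentinel is tested in one pass. This is a minimal standard-library sketch with hypothetical helper names. It assumes REDIS_SENTINEL_HOSTS is a comma-separated list of `host:port` entries, which this page doesn't confirm, so adjust the parsing to your actual format. It doesn't handle IPv6 literals.

```python
import socket


def parse_sentinel_hosts(raw, default_port=26379):
    """Parse "host1:26379,host2:26379" into [(host, port), ...].

    The comma-separated host:port format is an assumption.
    """
    hosts = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        hosts.append((host, int(port) if port else default_port))
    return hosts


def unreachable_sentinels(hosts, timeout=3.0):
    """Return the (host, port) pairs that refuse or time out on TCP connect."""
    failed = []
    for host, port in hosts:
        try:
            socket.create_connection((host, port), timeout=timeout).close()
        except OSError:
            failed.append((host, port))
    return failed
```

Calling `unreachable_sentinels(parse_sentinel_hosts(os.environ["REDIS_SENTINEL_HOSTS"]))` returns an empty list when every sentinel accepts TCP connections. A reachable sentinel can still report the wrong master, so the `sentinel master` check above is still needed.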
Missing schedule_to_close_timeout
Activity with @needs_lock requires schedule_to_close_timeout
Cause
The activity decorated with @needs_lock doesn't specify a schedule_to_close_timeout. The system requires this timeout to calculate lock TTL and prevent deadlocks.
Solution
Add schedule_to_close_timeout when executing the activity:
    from datetime import timedelta

    # Before (incorrect): no timeout, so no lock TTL can be calculated
    result = await workflow.execute_activity(
        locked_activity,
        args=[data],
    )

    # After (correct)
    result = await workflow.execute_activity(
        locked_activity,
        args=[data],
        schedule_to_close_timeout=timedelta(minutes=5),
    )
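To see why a TTL derived from the timeout prevents deadlocks, consider a holder that crashes without releasing its slot. The sketch below is an in-memory illustration of the general technique only: the class `TtlSlotLock` is hypothetical and isn't how the Redis-backed lock is implemented. Each slot records an expiry time, and expired slots are reclaimed on the next acquisition attempt.

```python
import time


class TtlSlotLock:
    """Illustrative slot lock where every held slot expires after a TTL."""

    def __init__(self, max_locks, clock=time.monotonic):
        self.max_locks = max_locks
        self._clock = clock
        self._expiry = {}  # owner -> time at which the slot expires

    def try_acquire(self, owner, ttl_seconds):
        now = self._clock()
        # Reclaim slots whose holders never released them in time
        self._expiry = {o: t for o, t in self._expiry.items() if t > now}
        if len(self._expiry) >= self.max_locks:
            return False
        self._expiry[owner] = now + ttl_seconds
        return True

    def release(self, owner):
        self._expiry.pop(owner, None)
```

Without a TTL, a slot held by a crashed activity would stay occupied indefinitely and block every later caller. With one, the slot becomes available again once the TTL passes, which is why the timeout must be known up front.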
See also
- Configure distributed locking: Set up Redis and environment configuration
- @needs_lock decorator: Decorator parameters and usage patterns
- Distributed locking FAQ: Frequently asked questions about distributed locking
Need help
If you still need assistance after trying these steps, contact Atlan support: Submit a request.