We are working with a Redis cluster consisting of multiple shards, with 3 nodes per shard. When a node restarts unexpectedly (due to a hardware reset, for example), the first `get` call to the affected node throws an exception, as expected. On encountering this exception, we reconnect to the cluster. Initially, however, the `cluster nodes` output does not report the affected node as failed.
The new cluster connection works until it tries to retrieve an element from the affected shard and opens a connection to the impaired node. At this point, the script freezes and remains unresponsive until the affected server starts responding to ICMP.
We suspect that the timeout settings are not being applied correctly during the initial connection attempt to a node retrieved from `cluster nodes`.
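For context, the failure state mentioned above is carried in the flags field (third column) of each `cluster nodes` line, where `fail?` marks a suspected failure and `fail` a confirmed one. The following is a minimal Python sketch of how that output could be checked. The function name `failed_nodes` and the sample data are hypothetical and are not part of our script or of any client library.

```python
from typing import List


def failed_nodes(cluster_nodes_output: str) -> List[str]:
    """Return the ip:port of every node flagged `fail` or `fail?`.

    Each line of CLUSTER NODES output has the form:
    <id> <ip:port@cport> <flags> <master> <ping-sent> <pong-recv> <epoch> <link-state> [slots...]
    """
    failed = []
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        address = fields[1].split("@")[0]  # drop the cluster bus port
        flags = fields[2].split(",")
        if "fail" in flags or "fail?" in flags:
            failed.append(address)
    return failed


# Hypothetical sample output, for illustration only.
SAMPLE = (
    "07c37dfeb235213a872192d90877d0cd55635b91 127.0.0.1:30004@31004 "
    "slave e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 0 1426238317239 4 connected\n"
    "67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 127.0.0.1:30002@31002 "
    "master,fail? - 0 1426238316232 2 connected 5461-10922\n"
)

if __name__ == "__main__":
    print(failed_nodes(SAMPLE))
```

In our case a check like this would return an empty list right after the restart, because the other nodes have not yet marked the affected node as failed.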
Expected Behaviour
If a server retrieved from `cluster nodes` is unreachable, the default timeout settings should apply, preventing the script from hanging.
Actual Behaviour
If a server retrieved from `cluster nodes` is unreachable, the script hangs indefinitely when attempting to retrieve data from this node.
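To illustrate the behaviour we expect, independently of any particular client library, here is a minimal Python sketch of a connection attempt bounded by a timeout. The helper name `node_reachable` and the 1.5 second default are assumptions made for this example. They are not the client's actual settings or code.

```python
import socket


def node_reachable(host: str, port: int, timeout: float = 1.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection applies `timeout` to the connect() call itself,
        # so a host that silently drops packets cannot block forever.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False
```

With a bound like this on the connect step, an unreachable node would produce an error after at most `timeout` seconds instead of freezing the script.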
If we look at netstat
Environment
Steps to Reproduce
`get` call to the affected node and observe the thrown exception.
Checklist
`develop` branch.
Update / additional notes
We are using
and the failing node is a slave