Details
- Bug
- Resolution: Duplicate
- Major
- 7.6.0
- 7.6.0-1507 Centos 7 64bit
- Untriaged
- Centos 64-bit
- 0
- No
Description
Steps:
- Create a 5-node cluster with n2n encryption level=all
- Set auto-failover timeout=1
- Induce a stop_memcached failure on node '172.23.110.65' (kill -STOP)
- Wait for auto-failover to happen
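For readers unfamiliar with the failure mode, here is a minimal sketch of what `kill -STOP` does to a process: it stays alive but stops executing, so it cannot answer requests. This is not the test framework's code. The helper names are hypothetical, and a short-lived child Python process stands in for memcached, since the ticket does not show how the test locates the memcached PID. It assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
import time


def suspend(pid: int) -> None:
    """Equivalent of `kill -STOP <pid>`: freeze the process without killing it."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid: int) -> None:
    """Equivalent of `kill -CONT <pid>`: let a stopped process run again."""
    os.kill(pid, signal.SIGCONT)


def demo_suspend_resume() -> bool:
    """Suspend a stand-in child process and report whether it is still alive."""
    # Stand-in for memcached (assumption): a child that just sleeps.
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        suspend(proc.pid)
        time.sleep(0.2)
        # poll() returns None while the child has not exited; a stopped
        # process has not exited, it is merely unresponsive.
        still_alive = proc.poll() is None
        resume(proc.pid)
        return still_alive
    finally:
        proc.terminate()
        proc.wait()


if __name__ == "__main__":
    print("stopped process still alive:", demo_suspend_resume())
```

Because the process never exits, the failure looks like an unresponsive data service rather than a crashed one, which matches the "did not respond" reason in the log under Observations.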
Observations:
According to the test log, stop_memcached was triggered at '21:40:22,424', and ns_server detected the node as down about half a second later:
[ns_server:debug,2023-09-14T21:40:22.944-07:00,ns_1@172.23.110.64:<0.15702.0>:auto_failover:log_down_nodes_reason:403]Node 'ns_1@172.23.110.65' is considered down. Reason:"The data service did not respond. Either none of the buckets have warmed up or there is an issue with the data service. "
However, the failover itself started only ~5 seconds after the node was detected as down, even though auto-failover timeout=1:
[ns_server:debug,2023-09-14T21:40:27.946-07:00,ns_1@172.23.110.64:<0.15700.0>:failover:start:44]Starting failover with Nodes = ['ns_1@172.23.110.65'], Options = #{allow_unsafe => ...
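The delays can be checked directly from the timestamps quoted above. The sketch below parses the two ns_server log timestamps and computes the gaps. The trigger time appears in the ticket without a date or UTC offset, so the code assumes it falls on the same date and in the same offset as the ns_server logs.

```python
from datetime import datetime

# Timestamps copied from the ns_server log lines in this ticket.
node_down_detected = datetime.fromisoformat("2023-09-14T21:40:22.944-07:00")
failover_started = datetime.fromisoformat("2023-09-14T21:40:27.946-07:00")

# Assumption: the test's trigger time '21:40:22,424' shares the date and
# UTC offset of the ns_server logs; the ticket gives only the time of day.
stop_triggered = datetime.fromisoformat("2023-09-14T21:40:22.424-07:00")

# Time from inducing the failure to ns_server marking the node down.
detection_delay = (node_down_detected - stop_triggered).total_seconds()

# Time from the node being marked down to the failover actually starting.
failover_delay = (failover_started - node_down_detected).total_seconds()

print(f"detection delay: {detection_delay:.3f} s")
print(f"failover delay:  {failover_delay:.3f} s")
```

Under that assumption, detection takes roughly half a second, while the gap between detection and the start of failover is just over 5 seconds.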
TAF test:
failover.concurrent_failovers.ConcurrentFailoverTests:
test_concurrent_failover,nodes_init=5,services_init=kv-kv-kv-kv-kv,replicas=3,maxCount=1,timeout=1,failover_order=kv,failover_method=stop_memcached,bucket_spec=single_bucket.default
Issue Links
- duplicates MB-58636 Janitor cannot be cancelled if memcached is unresponsive in query_vbuckets_loop (Closed)