Details
-
Bug
-
Resolution: Duplicate
-
Major
-
Trinity
-
7.6.0-1507
Centos 7 64bit
-
Untriaged
-
Centos 64-bit
-
-
0
-
No
Description
Steps:
- 5 node KV cluster with n2n encryption level=all
- 1 magma bucket with replica=3
- Set auto-failover timeout=1
- Induce failure (stop_memcached using SIGSTOP) on '172.23.110.65'
- Wait for auto-failover to happen
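
As a rough illustration of the "Set auto-failover timeout=1" step, the sketch below builds (but does not send) a request to the Couchbase `/settings/autoFailover` REST endpoint. The host, the port 8091, the helper name `build_auto_failover_request`, and the omission of credentials are assumptions for illustration; the actual test sets this through the TAF framework.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_auto_failover_request(host, timeout_s=1, max_count=1):
    """Build a POST request that enables auto-failover (hypothetical helper).

    Authentication is deliberately left out; a real call needs admin credentials.
    """
    body = urlencode(
        {"enabled": "true", "timeout": timeout_s, "maxCount": max_count}
    ).encode()
    # Port 8091 is assumed here; a TLS-only cluster would use a different port.
    return Request(
        f"http://{host}:8091/settings/autoFailover", data=body, method="POST"
    )


# Values taken from the test parameters: timeout=1, maxCount=1.
req = build_auto_failover_request("172.23.110.64")
print(req.get_method(), req.full_url, req.data)
```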
Observations:
From the test's point of view, the failure is induced at 21:40:22.944, and ns_server detects that the node is down immediately:
[ns_server:debug,2023-09-14T21:40:22.944-07:00,ns_1@172.23.110.64:<0.15702.0>:auto_failover:log_down_nodes_reason:403]Node 'ns_1@172.23.110.65' is considered down. Reason:"The data service did not respond. Either none of the buckets have warmed up or there is an issue with the data service. "
However, ns_server takes 5 seconds to trigger the auto-failover, although the auto-failover timeout is set to 1:
[ns_server:debug,2023-09-14T21:40:27.946-07:00,ns_1@172.23.110.64:<0.15700.0>:failover:start:44]Starting failover with Nodes = ['ns_1@172.23.110.65'], Options = #{allow_unsafe =>...
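
The gap between the two log entries can be checked mechanically. The following is a minimal sketch that assumes the timestamp is always the second comma-separated field of the bracketed ns_server log prefix; the helper name `log_timestamp` is hypothetical, and the log lines are shortened copies of the ones quoted above.

```python
import re
from datetime import datetime

# Second comma-separated field inside the leading "[...]" prefix.
TS_RE = re.compile(r"^\[[^,]+,([^,]+),")


def log_timestamp(line):
    """Return the timezone-aware timestamp of an ns_server log line."""
    match = TS_RE.match(line)
    if match is None:
        raise ValueError("no timestamp found in log line")
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")


# Shortened copies of the two log lines from this report.
down_line = (
    "[ns_server:debug,2023-09-14T21:40:22.944-07:00,"
    "ns_1@172.23.110.64:<0.15702.0>:auto_failover:log_down_nodes_reason:403]"
    "Node 'ns_1@172.23.110.65' is considered down."
)
failover_line = (
    "[ns_server:debug,2023-09-14T21:40:27.946-07:00,"
    "ns_1@172.23.110.64:<0.15700.0>:failover:start:44]"
    "Starting failover with Nodes = ['ns_1@172.23.110.65']"
)

delay = (log_timestamp(failover_line) - log_timestamp(down_line)).total_seconds()
print(f"{delay:.3f}")  # prints 5.002
```

The result is about 5 seconds between the node being reported down and the failover starting.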
TAF test:
failover.concurrent_failovers.ConcurrentFailoverTests:
test_concurrent_failover,nodes_init=5,services_init=kv-kv-kv-kv-kv,replicas=3,maxCount=1,timeout=1,failover_order=kv,failover_method=stop_memcached,bucket_spec=single_bucket.default
Issue Links
- duplicates
-
MB-58636 Janitor cannot be cancelled if memcached is unresponsive in query_vbuckets_loop
-
- Closed
-