Details
Bug
Resolution: Fixed
Critical
7.2.0
7.2.0 build 5275
Untriaged
CentOS 64-bit
0
Yes
Analytics Sprint 16
Description
Steps to Repro
./sequoia -client 172.23.104.27:2375 -provider file:centos_pine.yml -test tests/integration/7.2/test_7.2.yml -scope tests/integration/7.2/scope_7.2_magma.yml -scale 3 -repeat 0 -log_level 0 -version 7.2.0-5275 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
Started a longevity test on 7.2.0-5275. After 5 hours, when the test reached the swap rebalance of an analytics node, the rebalance failed. A similar issue was seen in MB-56060, where it was mentioned that the issue is fixed and would not be seen in 7.2. The error message here looks similar to the one in that ticket.
172.23.108.103 3:25:57 AM 28 Mar, 2023
Starting rebalance, KeepNodes = ['ns_1@172.23.104.155','ns_1@172.23.104.157',
'ns_1@172.23.104.5','ns_1@172.23.104.67',
'ns_1@172.23.104.69','ns_1@172.23.104.70',
'ns_1@172.23.105.107','ns_1@172.23.105.111',
'ns_1@172.23.106.100','ns_1@172.23.106.188',
'ns_1@172.23.108.103','ns_1@172.23.120.107',
'ns_1@172.23.120.245','ns_1@172.23.121.117',
'ns_1@172.23.123.28','ns_1@172.23.96.148',
'ns_1@172.23.96.192','ns_1@172.23.96.252',
'ns_1@172.23.96.253','ns_1@172.23.97.119',
'ns_1@172.23.97.121','ns_1@172.23.97.122',
'ns_1@172.23.97.239','ns_1@172.23.99.11',
'ns_1@172.23.99.20','ns_1@172.23.99.21',
'ns_1@172.23.99.25'], EjectNodes = ['ns_1@172.23.104.137'], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = a5cacdd0d6356aca105ea2f3e8e125d9
172.23.104.137 3:27:27 AM 28 Mar, 2023
Analytics Service unable to successfully rebalance ee5d415339b47cd7ceacde0af4e59251 due to 'CBAS0001: Analytics collections in different partitions have different DCP states. Mutations needed to catch up = 13216. User action: Try again later'; see analytics_info.log for details
172.23.108.103 3:27:28 AM 28 Mar, 2023
Rebalance exited with reason {service_rebalance_failed,cbas,
{worker_died,
{'EXIT',<0.3409.175>,
{rebalance_failed,
{service_error,
<<"Rebalance ee5d415339b47cd7ceacde0af4e59251 failed: CBAS0001: Analytics collections in different partitions have different DCP states. Mutations needed to catch up = 13216. User action: Try again later">>}}}}}.
Rebalance Operation Id = a5cacdd0d6356aca105ea2f3e8e125d9
cbcollect_info logs are attached. This failure was not seen on the run we had on 7.2.0-5263.
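For triage, it may help to pull the error code and the pending-mutation count out of the failure message so that runs can be compared. The following is only a rough sketch: `parse_cbas_rebalance_error` is a hypothetical helper, not part of sequoia or any Couchbase tooling, and the pattern assumes the message keeps the wording shown in the log above.

```python
import re

# Hypothetical helper for triage; not part of sequoia or Couchbase tooling.
# The pattern assumes the wording of the CBAS0001 message quoted in this ticket.
CBAS_ERROR = re.compile(
    r"(?P<code>CBAS\d{4}): (?P<text>.*?)\. "
    r"Mutations needed to catch up = (?P<pending>\d+)"
)


def parse_cbas_rebalance_error(message):
    """Return (error_code, pending_mutations), or None if the message does not match."""
    match = CBAS_ERROR.search(message)
    if match is None:
        return None
    return match.group("code"), int(match.group("pending"))


# Message text copied from the rebalance failure in this ticket.
sample = (
    "Rebalance ee5d415339b47cd7ceacde0af4e59251 failed: CBAS0001: Analytics "
    "collections in different partitions have different DCP states. "
    "Mutations needed to catch up = 13216. User action: Try again later"
)

print(parse_cbas_rebalance_error(sample))
```

Returning `None` for non-matching messages keeps the helper usable on mixed log lines without raising.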