Details
Bug
Status: Closed
Major
Resolution: Won't Fix
6.0.0
Enterprise Edition 6.0.0 build 1673
Untriaged
Centos 64-bit
No
CX Sprint 122
Description
9-node cluster: 6 KV and 3 CBAS nodes
CentOS 7, 8-core VMs
Note: I am assuming the Analytics rebalance is stuck, based on the rebalance % displayed in the UI. It shows 99% for the analytics nodes and 100% for the data nodes. Refer to the attached screenshot.
Hi Tanzeem Ahmed,
I looked at the logs, and the rebalance was actually progressing, but extremely slowly, due to IO congestion. When we were diagnosing the issue, before the test was declared a failure, the rebalance was at the step of rebalancing the data. However, if you check the attached logs, you will see that the rebalance has since moved on to the next step, creating the secondary indexes.
I believe the IO congestion is caused by the number of partitions on each analytics node. In this setup, each node has 8 partitions. If those VMs have spinning disks, I recommend reducing the number of partitions per node to 2. Similar issues were reported before, and reducing the number of partitions per node helped. Another option, which I don't recommend, would be to increase the timeout before declaring the test a failure.
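To illustrate the difference between a rebalance that is slow and one that is actually stuck, here is a rough sketch of how a test could wait on rebalance progress. It is only an illustration under assumptions, not the test framework's actual code: `get_progress` is a hypothetical callable that returns the rebalance percentage (for example, the value shown in the UI), and the timeout and stall thresholds are placeholder parameters.

```python
import time
from typing import Callable


def wait_for_rebalance(
    get_progress: Callable[[], float],   # hypothetical: returns rebalance % (0-100)
    timeout_s: float,                    # overall budget before giving up
    stall_s: float,                      # how long progress may stay flat
    poll_s: float = 1.0,                 # delay between progress checks
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll progress and return "done", "stalled", or "timeout".

    "stalled" means the percentage did not increase for stall_s seconds.
    "timeout" means progress kept moving but the overall budget ran out,
    which points at a slow rebalance rather than a stuck one.
    """
    start = clock()
    last_progress = get_progress()
    last_change = start
    while True:
        if last_progress >= 100.0:
            return "done"
        now = clock()
        if now - last_change >= stall_s:
            return "stalled"
        if now - start >= timeout_s:
            return "timeout"
        sleep(poll_s)
        current = get_progress()
        if current > last_progress:
            # Progress moved: remember the new value and reset the stall timer.
            last_progress = current
            last_change = clock()
```

With a check like this, a "timeout" result on a run that was still advancing would suggest IO congestion rather than a hang. A coarse percentage can stay flat for a long time even while work is happening, so `stall_s` would need to be generous.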