Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
7.2.1
-
7.2.1-5819 on GCP
-
Untriaged
-
0
-
Unknown
Description
A 5-node cluster with the following config:
3 KV + 2 GSI/Query (n2-standard-8, 200 GB disk + n2-standard-8, 450 GB disk)
After the initial data load, the test creates indexes and then runs an incremental data load to reach the required doc counts (listed below). A query workload runs during the incremental data load as well. Index Mutations Remaining has stayed at around 150 million for roughly the last 15 hours (screenshot attached). CPU utilisation has been above 85% throughout this period, so the machines are not idle. Because the Index Mutations Remaining count was not dropping, I killed all the data/query workload docker containers about 2 to 3 hours into the incremental workload, but the count still has not come down.
Also note that there have been quite a few auto-failovers. Since the auto-failover interval is set to 10 seconds for cloud instances, nodes keep getting auto-failed over and added back.
I am not sure whether this is related to indexing appearing stuck, but I have filed a separate ticket to look into why the nodes are being auto-failed over so often.
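For triage, the "stuck" condition described above (Index Mutations Remaining flat at around 150 million for about 15 hours) can be expressed as a simple plateau check over sampled stat values. The snippet below is only a sketch: the function name `is_stalled`, the 1% tolerance, and the sample readings are hypothetical and not taken from the cluster, and it does not query any Couchbase API.

```python
from typing import Sequence, Tuple


def is_stalled(
    samples: Sequence[Tuple[float, int]],
    window_hours: float,
    tolerance: float = 0.01,
) -> bool:
    """Return True if the metric moved by less than `tolerance` (relative)
    over the trailing `window_hours`.

    `samples` is a time-ordered list of (hours_since_start, value) pairs.
    """
    if len(samples) < 2:
        return False
    latest_t = samples[-1][0]
    # Require the samples to cover the whole window before calling it a stall.
    if latest_t - samples[0][0] < window_hours:
        return False
    window = [v for t, v in samples if latest_t - t <= window_hours]
    hi, lo = max(window), min(window)
    if hi == 0:
        return False  # nothing pending: drained, not stalled
    return (hi - lo) / hi < tolerance


# Hypothetical hourly readings (assumed values, for illustration only).
flat = [(float(h), 150_000_000 - 1_000 * h) for h in range(16)]
draining = [(float(h), 150_000_000 - 10_000_000 * h) for h in range(16)]

print("flat series stalled:", is_stalled(flat, 15.0))
print("draining series stalled:", is_stalled(draining, 15.0))
```

A relative tolerance is used rather than exact equality because the stat may still fluctuate slightly even when indexing makes no real progress.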
cbcollect ->
https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-d-node-001.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-d-node-002.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-d-node-003.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-qi-node-004.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTest_11Jul_Slow_Indexing/collectinfo-2023-07-12T040332-ns_1%40svc-qi-node-005.r45djlf-eb-k-m7.sandbox.nonprod-project-avengers.com.zip
Info for QE ->
Script used -> cmd/cp-cli/scenarios/system_tests/provisioned/provisioned_gsi_system_test.yaml
10 buckets with doc counts: 7500000, 1800000, 5000000, 2000000, 4000000, 600000, 60000, 600000, 6001, 9001.
The total number of indexes is 584.
Wait for the initial data load, index creation, and incremental data load steps.
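To put the roughly 150 million pending mutations in context, the per-bucket doc counts above can be totalled. This is a small sketch using only the numbers listed in this ticket; the variable names are illustrative, and the per-bucket index figure is just an average, since the ticket does not say how the 584 indexes are distributed across buckets.

```python
# Per-bucket document counts as listed in the ticket (10 buckets).
doc_counts = [
    7_500_000, 1_800_000, 5_000_000, 2_000_000, 4_000_000,
    600_000, 60_000, 600_000, 6_001, 9_001,
]
total_indexes = 584  # as stated in the ticket

total_docs = sum(doc_counts)
avg_indexes_per_bucket = total_indexes / len(doc_counts)

print(f"buckets: {len(doc_counts)}")
print(f"total docs: {total_docs:,}")  # 21,575,002
print(f"average indexes per bucket: {avg_indexes_per_bucket:.1f}")
```

Because each document mutation may need to be processed for several indexes, the pending-mutation count can legitimately exceed the total doc count.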