Test Steps
- Deploy a GCP cluster consisting of 3 KV, 2 GSI and 2 Query nodes.
- Create a Magma bucket with 1 scope and 10 collections.
- Load 100 million docs into each of the 10 collections, with each document 1 kB in size.
- Create 4 indexes per collection, covering different index types (array indexes, partitioned indexes, etc.), and wait for the index builds to complete.
- Modify the compute for the GSI and Data nodes, which triggers a set of rebalance operations.
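For a rough sense of scale, the totals implied by the steps above can be worked out as follows. This is a back-of-the-envelope sketch only: it assumes 1 kB means 1,000 bytes and counts raw document bytes, ignoring Magma overhead, replicas, and index storage. The variable names are illustrative.

```python
# Back-of-the-envelope totals for the test setup (values taken from the test steps).
COLLECTIONS = 10
DOCS_PER_COLLECTION = 100_000_000
DOC_SIZE_BYTES = 1_000  # assumption: 1 kB = 1,000 bytes
INDEXES_PER_COLLECTION = 4

total_docs = COLLECTIONS * DOCS_PER_COLLECTION
raw_data_bytes = total_docs * DOC_SIZE_BYTES
total_indexes = COLLECTIONS * INDEXES_PER_COLLECTION

print(f"documents: {total_docs:,}")               # documents: 1,000,000,000
print(f"raw data:  {raw_data_bytes / 1e12:.1f} TB")  # raw data:  1.0 TB
print(f"indexes:   {total_indexes}")              # indexes:   40
```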
Rebalance Start
2023-08-04T14:15:46.835Z, ns_orchestrator:0:info:message(ns_1@svc-d-node-013.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com) - Starting rebalance, KeepNodes = ['ns_1@svc-d-node-013.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com',
'ns_1@svc-d-node-014.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com',
'ns_1@svc-d-node-015.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com',
'ns_1@svc-i-node-004.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com',
'ns_1@svc-i-node-016.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com',
'ns_1@svc-q-node-006.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com',
'ns_1@svc-q-node-007.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com'], EjectNodes = ['ns_1@svc-i-node-005.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com'], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = fc3a6afe5699949bb90c6d7468585103
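To check which services are affected by this rebalance, the KeepNodes and EjectNodes lists in the message can be grouped by the letter code in each hostname. The sketch below is a hypothetical helper, not part of any Couchbase tooling. It assumes that `d`, `i`, and `q` in `svc-<code>-node-NNN` stand for Data, Index, and Query nodes, which is inferred from the cluster topology in the test steps rather than stated in the log.

```python
import re
from collections import Counter

# Hypothetical pattern: captures the service letter code from names such as
# 'ns_1@svc-i-node-005.<domain>'. The meaning of the code is an assumption.
NODE_RE = re.compile(r"'ns_1@svc-([a-z]+)-node-\d+[^']*'")

def split_keep_eject(message):
    """Return the raw KeepNodes and EjectNodes list bodies from the message."""
    keep = re.search(r"KeepNodes = \[(.*?)\]", message, re.S).group(1)
    eject = re.search(r"EjectNodes = \[(.*?)\]", message, re.S).group(1)
    return keep, eject

def nodes_by_service(node_list_text):
    """Count nodes per service letter code in an Erlang-style node list."""
    return Counter(m.group(1) for m in NODE_RE.finditer(node_list_text))

# Message rebuilt from the hostnames in the log entry.
DOMAIN = "cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com"
keep_names = ["svc-d-node-013", "svc-d-node-014", "svc-d-node-015",
              "svc-i-node-004", "svc-i-node-016",
              "svc-q-node-006", "svc-q-node-007"]
message = (
    "Starting rebalance, KeepNodes = ["
    + ",\n".join(f"'ns_1@{name}.{DOMAIN}'" for name in keep_names)
    + f"], EjectNodes = ['ns_1@svc-i-node-005.{DOMAIN}'], "
    + "Failed over and being ejected nodes = []"
)

keep, eject = split_keep_eject(message)
print(dict(nodes_by_service(keep)))   # {'d': 3, 'i': 2, 'q': 2}
print(dict(nodes_by_service(eject)))  # {'i': 1}
```

Under that naming assumption, the kept nodes match the 3 KV, 2 GSI and 2 Query topology, and the single ejected node is an index node.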
Rebalance Failure
GSI rebalance fails continuously with the error "indexer rebalance failure - index build is in progress for indexes".
grep "indexer rebalance failure - index build is in progress for indexes" diag.log | wc -l
438
2023-08-04T14:15:49.260Z, ns_orchestrator:0:critical:message(ns_1@svc-d-node-013.cpjvl0qdtbgi0qio.sandbox.nonprod-project-avengers.com) - Rebalance exited with reason {service_rebalance_failed,index,
{worker_died,
{'EXIT',<0.10069.57>,
{rebalance_failed,
{service_error,
<<"indexer rebalance failure - index build is in progress for indexes: [default:idx13_TvVN12H default:idx3_0GMdpAB default:idx13_TvVN12H default:idx4_5aRzp default:idx2_c37HMQwe default:idx2_c37HMQwe].">>}}}}}.
Rebalance Operation Id = fc3a6afe5699949bb90c6d7468585103
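The index list in the error contains repeated names, so it can help to tally it. The following is a minimal sketch, assuming the message always has the bracketed, space-separated form seen in this log entry. `indexes_in_error` is a hypothetical helper name, and the log does not say why a name appears more than once.

```python
import re
from collections import Counter

ERROR_PREFIX = "indexer rebalance failure - index build is in progress for indexes:"

def indexes_in_error(message):
    """Return the names listed inside the brackets that follow the error prefix."""
    match = re.search(re.escape(ERROR_PREFIX) + r"\s*\[([^\]]*)\]", message)
    return match.group(1).split() if match else []

# Error text copied from the failure log entry.
message = (
    "indexer rebalance failure - index build is in progress for indexes: "
    "[default:idx13_TvVN12H default:idx3_0GMdpAB default:idx13_TvVN12H "
    "default:idx4_5aRzp default:idx2_c37HMQwe default:idx2_c37HMQwe]."
)

counts = Counter(indexes_in_error(message))
for name, occurrences in counts.items():
    print(name, occurrences)
# default:idx13_TvVN12H 2
# default:idx3_0GMdpAB 1
# default:idx4_5aRzp 1
# default:idx2_c37HMQwe 2
```

For this entry, the six listed names reduce to four distinct indexes.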