Details
-
Bug
-
Resolution: Duplicate
-
Major
-
7.6.0
-
Initial version - 7.2.0-5325
Upgrade version - 7.6.0-1871
OS - Debian 10
-
Untriaged
-
Linux x86_64
-
-
0
-
Unknown
Description
Steps:
- Install 7.2.0-5325 on 4 nodes and initialise a cluster with the following services: Node 1 - KV, Node 2 - KV, Node 3 - index:n1ql, Node 4 - index:n1ql. Set the indexing memory quota to 3GB per node.
- Create a Magma bucket with 1GB RAM per node and 1 replica.
- Load 5 million documents into the bucket.
- Upsert 20% of the data in the bucket.
- Create various types of indexes on different fields of the document: primary indexes and secondary indexes with replicas and partitions.
- Create a new scope 'myscope' and a collection 'mycoll' inside the new scope. Load 500,000 documents into the collection 'mycoll'. Create new indexes on the documents in the new collection.
- Start the upgrade process to 7.6.0-1871, following an iterative swap rebalance approach.
- During the upgrade, an update and read workload runs against the cluster, and multiple N1QL queries run in parallel. During the upgrade of each node, 3500 queries are executed that make use of the indexes created prior to the upgrade.
- Upgrade of the first KV node was successful.
- However, during the upgrade of the second KV node, the swap rebalance exited with the following reason:
Rebalance exited with reason {mover_crashed,{unexpected_exit, {'EXIT',<0.2533.3>, {{bulk_set_vbucket_state_failed, [{'ns_1@172.23.217.47', {'EXIT', {{{badmatch, {error, {setup_replications_failed, [{'ns_1@172.23.217.52', {errors,[{134,501}]}}]}}}, [{janitor_agent, handle_apply_vbucket_state,2, [{file,"src/janitor_agent.erl"}, {line,1068}]}, {janitor_agent, apply_vbucket_states_worker_loop,0, [{file,"src/janitor_agent.erl"}, {line,1057}]}, {proc_lib,init_p,3, [{file,"proc_lib.erl"},{line,225}]}]}, {gen_server,call, [{'janitor_agent-bucket-0', 'ns_1@172.23.217.47'}, {if_rebalance,<0.617.3>, {update_vbucket_state,501,replica, undefined,'ns_1@172.23.217.52'}}, infinity]}}}}]}, [{janitor_agent,bulk_set_vbucket_state,4, [{file,"src/janitor_agent.erl"}, {line,393}]}, {proc_lib,init_p,3, [{file,"proc_lib.erl"},{line,225}]}]}}}}. Rebalance Operation Id = 4dfda98b6b727d79aae93fd0547d393e
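When triaging, the interesting fields of a reason string like the one above (the peer node named in setup_replications_failed, the vBucket and target state from update_vbucket_state, and the operation id) can be pulled out with a few regular expressions. The following is only a rough sketch: the function name and dictionary keys are made up for illustration, the excerpt is abridged from the full message, and the `{134,501}` pair is returned as-is because the ticket does not say how to interpret it.

```python
import re

# Matches a quoted Erlang node name, with or without backslash-escaped quotes.
NODE = r"\\?'(ns_1@[\d.]+)\\?'"

# Abridged fragments of the reason string reported in this ticket.
REASON_EXCERPT = (
    "{setup_replications_failed, [{'ns_1@172.23.217.52', {errors,[{134,501}]}}]} "
    "{update_vbucket_state,501,replica, undefined,'ns_1@172.23.217.52'} "
    "Rebalance Operation Id = 4dfda98b6b727d79aae93fd0547d393e"
)


def summarize_reason(reason):
    """Hypothetical helper: extract a few fields from a rebalance failure reason."""
    summary = {}

    # Node listed next to setup_replications_failed, plus the raw error pair.
    m = re.search(
        r"setup_replications_failed,\s*\[\{" + NODE + r",\s*\{errors,\[\{(\d+),(\d+)\}",
        reason,
    )
    if m:
        summary["peer_node"] = m.group(1)
        summary["error_pair"] = (int(m.group(2)), int(m.group(3)))

    # vBucket id and the state it was being moved to.
    m = re.search(r"update_vbucket_state,(\d+),(\w+)", reason)
    if m:
        summary["vbucket"] = int(m.group(1))
        summary["target_state"] = m.group(2)

    # Rebalance operation id, useful for searching the collected logs.
    m = re.search(r"Rebalance Operation Id = ([0-9a-f]+)", reason)
    if m:
        summary["operation_id"] = m.group(1)

    return summary


if __name__ == "__main__":
    for key, value in summarize_reason(REASON_EXCERPT).items():
        print(f"{key}: {value}")
```

The operation id and vBucket number are convenient search keys when going through the attached logs.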
An update and read workload was running when this happened.
172.23.217.47 was the incoming node and 172.23.217.48 was the outgoing node.
Cb-collect logs have been attached.
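For anyone reproducing the setup, the scope, collection and index steps could look roughly like the N1QL statements assembled below. This is a sketch under assumptions: the bucket name `bucket-0` is inferred from the 'janitor_agent-bucket-0' process name in the error, while the field name `age`, the index name and the replica count are placeholders, since the ticket does not list the actual index definitions.

```python
import json


def keyspace(bucket, scope, collection):
    """Return a backtick-quoted N1QL keyspace path."""
    return f"`{bucket}`.`{scope}`.`{collection}`"


def setup_statements(bucket="bucket-0", scope="myscope", collection="mycoll",
                     field="age", num_replica=1):
    """Build illustrative N1QL statements; field and replica count are assumed."""
    ks = keyspace(bucket, scope, collection)
    with_clause = json.dumps({"num_replica": num_replica})
    return [
        f"CREATE SCOPE `{bucket}`.`{scope}`",
        f"CREATE COLLECTION {ks}",
        f"CREATE PRIMARY INDEX ON {ks}",
        # Partitioned secondary index with one replica (two index nodes in this cluster).
        f"CREATE INDEX `idx_{field}` ON {ks}(`{field}`) "
        f"PARTITION BY HASH(`{field}`) WITH {with_clause}",
    ]


if __name__ == "__main__":
    for statement in setup_statements():
        print(statement + ";")
```

The statements are only printed here; they would still need to be run through the query service of the cluster under test.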
Issue Links
- duplicates
-
MB-59826 [Magma] - Rebalance out with 1 DGM bucket fails with "'Rebalance exited with reason {mover_crashed"
- Closed