Details

Type: Bug
Resolution: Duplicate
Priority: Major
Fix Version/s: 7.6.0
Affects Version/s: 7.6.0
Component/s: couchbase-bucket
Labels:
- rebalance-failed
- upgrade
Environment:
Initial version - 7.2.0-5325
Upgrade version - 7.6.0-1871
OS - Debian 10

Triage:
Untriaged
Operating System:
Linux x86_64
Link to Log File, atop/blg, CBCollectInfo, Core dump:

https://cb-engineering.s3.amazonaws.com/kv_index_reb_failure/collectinfo-2023-11-30T084110-ns_1%40172.23.217.47.zip
https://cb-engineering.s3.amazonaws.com/kv_index_reb_failure/collectinfo-2023-11-30T084110-ns_1%40172.23.217.48.zip
https://cb-engineering.s3.amazonaws.com/kv_index_reb_failure/collectinfo-2023-11-30T084110-ns_1%40172.23.217.50.zip
https://cb-engineering.s3.amazonaws.com/kv_index_reb_failure/collectinfo-2023-11-30T084110-ns_1%40172.23.217.51.zip
https://cb-engineering.s3.amazonaws.com/kv_index_reb_failure/collectinfo-2023-11-30T084110-ns_1%40172.23.217.52.zip

Story Points:
0
Is this a Regression?:
Unknown

Description

Steps:

Install 7.2.0-5325 on 4 nodes and initialise a cluster with these services: Node 1 - KV, Node 2 - KV, Node 3 - index:n1ql, Node 4 - index:n1ql.
Set indexing memory quota = 3GB per node
Create a Magma bucket with 1GB RAM per node and 1 replica.
Load 5 million documents into the bucket.
Upsert 20% of the data in the bucket.
Create various types of indexes on different fields of the document: primary indexes, and secondary indexes with replicas and partitions.
Create a new scope 'myscope' and a collection 'mycoll' inside the new scope. Load 500,000 documents into the collection 'mycoll'.
Create new indexes on the documents in the new collection.
Start the upgrade process to 7.6.0-1871, following an iterative swap rebalance approach.
During the upgrade, an update and read workload runs against the cluster, and multiple N1QL queries run in parallel. During the upgrade of each node, 3500 queries are executed that make use of the indexes created prior to the upgrade.
The upgrade of the first KV node was successful.
However, during the upgrade of the second KV node, the swap rebalance exited with the following reason:

'Rebalance exited with reason {mover_crashed,{unexpected_exit,
  {\'EXIT\',<0.2533.3>,
    {{bulk_set_vbucket_state_failed,
      [{\'ns_1@172.23.217.47\',
        {\'EXIT\',
          {{{badmatch,
              {error,
                {setup_replications_failed,
                  [{\'ns_1@172.23.217.52\',
                    {errors,[{134,501}]}}]}}},
            [{janitor_agent,handle_apply_vbucket_state,2,
               [{file,"src/janitor_agent.erl"},{line,1068}]},
             {janitor_agent,apply_vbucket_states_worker_loop,0,
               [{file,"src/janitor_agent.erl"},{line,1057}]},
             {proc_lib,init_p,3,
               [{file,"proc_lib.erl"},{line,225}]}]},
           {gen_server,call,
             [{\'janitor_agent-bucket-0\',\'ns_1@172.23.217.47\'},
              {if_rebalance,<0.617.3>,
                {update_vbucket_state,501,replica,undefined,\'ns_1@172.23.217.52\'}},
              infinity]}}}}]},
     [{janitor_agent,bulk_set_vbucket_state,4,
        [{file,"src/janitor_agent.erl"},{line,393}]},
      {proc_lib,init_p,3,
        [{file,"proc_lib.erl"},{line,225}]}]}}}}.
Rebalance Operation Id = 4dfda98b6b727d79aae93fd0547d393e'}

The update and read workload was running when this happened.
172.23.217.47 was the incoming node and 172.23.217.48 was the outgoing node.
The cbcollect logs are linked above.
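
For triage, the key fields of the failure reason can be pulled out mechanically. The following is a minimal sketch, not part of any Couchbase tooling: the function name `summarize_rebalance_failure`, the result keys, and the shortened `EXCERPT` (with `...` marking omitted stack frames) are all illustrative assumptions. It makes no claim about what the `{134,501}` pair means beyond noting that 501 matches the vBucket in `update_vbucket_state`.

```python
import re

# Shortened copy of the failure reason from this ticket; "..." marks
# stack frames omitted for brevity (an editorial shortening, not log output).
EXCERPT = r"""Rebalance exited with reason {mover_crashed,{unexpected_exit,
  {\'EXIT\',<0.2533.3>, {{bulk_set_vbucket_state_failed, [{\'ns_1@172.23.217.47\',
  {\'EXIT\', {{{badmatch, {error, {setup_replications_failed,
  [{\'ns_1@172.23.217.52\', {errors,[{134,501}]}}]}}}, ...
  {update_vbucket_state,501,replica, undefined,\'ns_1@172.23.217.52\'}}, ...
  Rebalance Operation Id = 4dfda98b6b727d79aae93fd0547d393e"""


def summarize_rebalance_failure(reason: str) -> dict:
    """Extract a few fields from a rebalance failure string (hypothetical helper)."""
    # Undo the backslash-escaped quotes and collapse runs of whitespace.
    text = re.sub(r"\s+", " ", reason.replace("\\'", "'"))

    top = re.search(r"exited with reason \{(\w+),", text)
    repl = re.search(
        r"setup_replications_failed,\s*\[\{'([^']+)',\s*\{errors,\s*\[([^\]]*)\]",
        text,
    )
    state = re.search(
        r"update_vbucket_state,\s*(\d+),\s*(\w+),\s*\w+,\s*'([^']+)'", text
    )
    op_id = re.search(r"Rebalance Operation Id = ([0-9a-f]+)", text)

    return {
        "top_level_reason": top.group(1) if top else None,
        # Node named inside the setup_replications_failed tuple.
        "failed_node": repl.group(1) if repl else None,
        # Raw integer pairs from the errors list, left uninterpreted.
        "error_pairs": [
            (int(a), int(b))
            for a, b in re.findall(r"\{(\d+),\s*(\d+)\}", repl.group(2))
        ] if repl else [],
        "vbucket": int(state.group(1)) if state else None,
        "target_state": state.group(2) if state else None,
        "operation_id": op_id.group(1) if op_id else None,
    }


if __name__ == "__main__":
    for key, value in summarize_rebalance_failure(EXCERPT).items():
        print(f"{key}: {value}")
```

Matching on the operation id and vBucket number extracted this way is one possible way to locate the corresponding entries in the collected logs.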

Issue Links

duplicates

MB-59826 [Magma] - Rebalance out with 1 DGM bucket fails with "'Rebalance exited with reason {mover_crashed"

Closed

People

Assignee: Vibhav S P

Reporter: Vibhav S P

Votes: 0

Watchers: 3

Dates

Created: 30/Nov/23 1:25 AM

Updated: 30/Nov/23 10:29 PM

Resolved: 30/Nov/23 8:41 AM

[Upgrade] Swap Rebalance of a KV node fails when upgrading the cluster from 7.2.0 to 7.6.0
