Couchbase Server / MB-59904

[Upgrade] Swap Rebalance of a KV node fails when upgrading the cluster from 7.2.0 to 7.6.0

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Fix Version/s: 7.6.0
    • Affects Version/s: 7.6.0
    • Component/s: couchbase-bucket
    • Environment:
      Initial version - 7.2.0-5325
      Upgrade version - 7.6.0-1871
      OS - Debian 10

    Description

      Steps:

      1. Install 7.2.0-5325 on 4 nodes and initialise a cluster with these services: Node 1 - KV, Node 2 - KV, Node 3 - index:n1ql, Node 4 - index:n1ql.
        Set the indexing memory quota to 3GB per node.
      2. Create a Magma bucket with 1GB RAM per node and 1 replica.
      3. Load 5 million documents into the bucket.
      4. Upsert 20% of the data in the bucket.
      5. Create various types of indexes (primary indexes, and secondary indexes with replicas and partitions) on different fields of the documents.
      6. Create a new scope 'myscope' and a collection 'mycoll' inside it. Load 500,000 documents into 'mycoll'.
        Create new indexes on the documents in the new collection.
      7. Start the upgrade to 7.6.0-1871, following an iterative swap rebalance approach.
      8. During the upgrade, an update and read workload runs on the cluster, with multiple N1QL queries running in parallel. During the upgrade of each node, 3500 queries are executed that use the indexes created before the upgrade.
      9. The upgrade of the first KV node was successful.
      10. During the upgrade of the second KV node, however, the swap rebalance exited with the following reason:

      'Rebalance exited with reason {mover_crashed,{unexpected_exit,
        {\'EXIT\',<0.2533.3>,
          {{bulk_set_vbucket_state_failed,
             [{\'ns_1@172.23.217.47\',
               {\'EXIT\',
                 {{{badmatch,
                     {error,
                       {setup_replications_failed,
                         [{\'ns_1@172.23.217.52\',
                           {errors,[{134,501}]}}]}}},
                   [{janitor_agent,handle_apply_vbucket_state,2,
                      [{file,"src/janitor_agent.erl"},{line,1068}]},
                    {janitor_agent,apply_vbucket_states_worker_loop,0,
                      [{file,"src/janitor_agent.erl"},{line,1057}]},
                    {proc_lib,init_p,3,
                      [{file,"proc_lib.erl"},{line,225}]}]},
                  {gen_server,call,
                    [{\'janitor_agent-bucket-0\',\'ns_1@172.23.217.47\'},
                     {if_rebalance,<0.617.3>,
                       {update_vbucket_state,501,replica,undefined,\'ns_1@172.23.217.52\'}},
                     infinity]}}}}]},
           [{janitor_agent,bulk_set_vbucket_state,4,
              [{file,"src/janitor_agent.erl"},{line,393}]},
            {proc_lib,init_p,3,
              [{file,"proc_lib.erl"},{line,225}]}]}}}}.
      Rebalance Operation Id = 4dfda98b6b727d79aae93fd0547d393e'}
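For triage across several failed runs, the innermost failure in a reason string like the one above can be pulled out mechanically. The following is a minimal Python sketch, not part of the original report. The function name and regular expressions are illustrative. Reading the pair {134,501} as (status code, vBucket) is an assumption, based only on vBucket 501 also appearing in the update_vbucket_state call in the same trace.

```python
import re

# Matches the innermost failure tuple of the rebalance exit reason, e.g.
# {setup_replications_failed,[{\'ns_1@172.23.217.52\',{errors,[{134,501}]}}]}
# The quotes may or may not be backslash-escaped, depending on how the log was captured.
FAILURE_RE = re.compile(
    r"setup_replications_failed,\s*\[\{\\?'(?P<node>[^'\\]+)\\?',\s*"
    r"\{errors,\s*\[(?P<errors>[^\]]*)\]"
)
PAIR_RE = re.compile(r"\{(\d+),\s*(\d+)\}")


def parse_setup_replications_failure(reason):
    """Return (node, [(code, vbucket), ...]), or None if the pattern is absent.

    Assumption: each {A,B} pair is (status code, vBucket id).
    """
    match = FAILURE_RE.search(reason)
    if match is None:
        return None
    pairs = [(int(a), int(b)) for a, b in PAIR_RE.findall(match.group("errors"))]
    return match.group("node"), pairs


# Fragment copied from the failure reason in this ticket.
reason = (
    r"{setup_replications_failed,[{\'ns_1@172.23.217.52\',"
    r"{errors,[{134,501}]}}]}"
)
result = parse_setup_replications_failure(reason)
print(result)
```

If the first number is indeed a status code, 134 is 0x86 in hexadecimal, which may correspond to a temporary-failure status from the data service; the ticket itself does not confirm this.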

      The update and read workload was still running when the rebalance failed.
      172.23.217.47 was the incoming node and 172.23.217.48 was the outgoing node.
      cbcollect_info logs are attached.
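For reference, the swap in step 10 (one node in, one node out within a single rebalance) can be expressed as the form body of a POST to /controller/rebalance on the cluster manager. The Python sketch below only builds that body and does not contact a cluster. It is an illustration, not the test framework's actual code: the helper names are hypothetical, and it assumes the incoming node has already been added to the cluster. Only the addresses that appear in this ticket are used; the two index:n1ql nodes are not named here and would also have to be listed in knownNodes.

```python
from urllib.parse import parse_qs, urlencode


def otp_name(host):
    """Node name in the 'ns_1@<ip>' form seen in the trace."""
    return f"ns_1@{host}"


def swap_rebalance_body(remaining_hosts, incoming_host, outgoing_host):
    """Build the form body for a swap rebalance.

    knownNodes lists every node in the cluster, including the one
    being ejected; ejectedNodes lists only the outgoing node.
    """
    known = [otp_name(h) for h in (*remaining_hosts, incoming_host, outgoing_host)]
    return urlencode(
        {
            "knownNodes": ",".join(known),
            "ejectedNodes": otp_name(outgoing_host),
        }
    )


body = swap_rebalance_body(
    remaining_hosts=["172.23.217.52"],  # other node seen in the trace; index:n1ql nodes omitted
    incoming_host="172.23.217.47",
    outgoing_host="172.23.217.48",
)
decoded = parse_qs(body)
print(decoded["knownNodes"][0].split(","))
print(decoded["ejectedNodes"])
```

Keeping the body construction separate from the HTTP call makes it easy to check which nodes a given upgrade iteration intends to keep and eject before the rebalance is triggered.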

            People

              Assignee: vibhav.sp Vibhav S P
              Reporter: vibhav.sp Vibhav S P