Couchbase Server / MB-56763

[System test upgrade] :- Online upgrade from 7.1.4-3631 -> 7.2.0-5322 using failover/recovery strategy continuously fails with "Rebalance exited with reason {mover_crashed, {unexpected_exit, {'EXIT',<0.26396.79>, {{wait_seqno_persisted_failed,"bucket8"""

Details

    • Untriaged
    • Centos 64-bit
    • 0
    • No

    Description

      Steps to Repro
      1. Run a longevity test on 7.1.4 for 4 days.

      ./sequoia -client 172.23.104.27:2375 -provider file:centos_pine.yml -test tests/integration/neo/test_neo.yml -scope tests/integration/neo/scope_neo_magma.yml -scale 3 -repeat 0 -log_level 0 -version 7.1.4-3601 -skip_setup=false -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true
      

      2. Upgrade to 7.2.0-5322 using an online upgrade with the failover/recovery strategy.

      The first couple of rebalances completed without problems. However, the 3rd failover/recovery rebalance fails every time it is retried, always while rebalancing bucket8.

      172.23.120.75 4:29:19 AM 5 May, 2023

      Starting rebalance, KeepNodes = ['ns_1@172.23.120.73','ns_1@172.23.120.74', 'ns_1@172.23.120.75','ns_1@172.23.120.77', 'ns_1@172.23.120.81','ns_1@172.23.120.86', 'ns_1@172.23.121.77','ns_1@172.23.123.26', 'ns_1@172.23.123.31','ns_1@172.23.123.32', 'ns_1@172.23.123.33','ns_1@172.23.96.122', 'ns_1@172.23.96.14','ns_1@172.23.96.243', 'ns_1@172.23.96.254','ns_1@172.23.96.48', 'ns_1@172.23.97.105','ns_1@172.23.97.110', 'ns_1@172.23.97.112','ns_1@172.23.97.148', 'ns_1@172.23.97.241','ns_1@172.23.97.74'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = 319cc39bd0fb403f4ecf22a56755358f

      172.23.120.75 4:29:25 AM 5 May, 2023

      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.14753.79>,
      {{{{{child_interrupted,
      {'EXIT',<21850.7567.2960>,socket_closed}},
      [{dcp_replicator,spawn_and_wait,1,
      [{file,"src/dcp_replicator.erl"},
      {line,358}]},
      {dcp_replicator,handle_call,3,
      [{file,"src/dcp_replicator.erl"},
      {line,146}]},
      {gen_server,try_handle_call,4,
      [{file,"gen_server.erl"},{line,721}]},
      {gen_server,handle_msg,6,
      [{file,"gen_server.erl"},{line,750}]},
      {proc_lib,init_p_do_apply,3,
      [{file,"proc_lib.erl"},{line,226}]}]},
      {gen_server,call,
      [<21850.8861.2960>,
      {setup_replication,
      [644,689,693,694,701,702,750,751,865,
      915,925,935,955,978,1002]},
      infinity]}},
      {gen_server,call,
      ['replication_manager-bucket8',
      {change_vbucket_replication,644,
      'ns_1@172.23.120.81'},
      infinity]}},
      {gen_server,call,
      [{'janitor_agent-bucket8',
      'ns_1@172.23.123.33'},
      {if_rebalance,<0.15886.79>,
      {wait_dcp_data_move,
      ['ns_1@172.23.120.73'],
      583}},
      infinity]}}}}}.
      Rebalance Operation Id = 319cc39bd0fb403f4ecf22a56755358f
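
For triage, the nested gen_server calls in the exit reason above can be reduced to the bucket, nodes, and vBuckets involved. The following is only a minimal sketch, not part of ns_server or cbcollect_info: the `summarize` helper and the `EXIT_REASON` constant are hypothetical names, the regular expressions assume the term is printed exactly as quoted in this ticket, and only the tail of the term is pasted in.

```python
import re

# Tail of the exit reason quoted in this ticket (whitespace is not significant).
EXIT_REASON = """
{gen_server,call,
 [<21850.8861.2960>,
  {setup_replication,
   [644,689,693,694,701,702,750,751,865,
    915,925,935,955,978,1002]},
  infinity]}},
{gen_server,call,
 ['replication_manager-bucket8',
  {change_vbucket_replication,644,
   'ns_1@172.23.120.81'},
  infinity]}},
{gen_server,call,
 [{'janitor_agent-bucket8',
   'ns_1@172.23.123.33'},
  {if_rebalance,<0.15886.79>,
   {wait_dcp_data_move,
    ['ns_1@172.23.120.73'],
    583}},
  infinity]}}
"""


def summarize(reason: str) -> dict:
    """Hypothetical helper: pull buckets, nodes and vBuckets out of the term text."""
    # Drop all whitespace so that line wrapping inside the term does not matter.
    flat = re.sub(r"\s+", "", reason)

    # Node names appear as quoted atoms such as 'ns_1@172.23.120.81'.
    nodes = sorted(set(re.findall(r"'(ns_1@[\d.]+)'", flat)))

    # Per-bucket processes are named '<process>-<bucket>'.
    buckets = sorted(
        set(re.findall(r"'(?:replication_manager|janitor_agent)-([^']+)'", flat))
    )

    # vBucket list passed to setup_replication.
    match = re.search(r"\{setup_replication,\[([\d,]+)\]\}", flat)
    vbuckets = [int(v) for v in match.group(1).split(",")] if match else []

    return {"buckets": buckets, "nodes": nodes, "vbuckets": vbuckets}


if __name__ == "__main__":
    for key, value in summarize(EXIT_REASON).items():
        print(f"{key}: {value}")
```

Run against the excerpt, this points at bucket8, the three nodes named in the call chain, and the 15 vBuckets in the `setup_replication` request, which may help narrow down which memcached and ns_server logs to read first in the cbcollect_info archives.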
      

      cbcollect_info attached.

          People

            ashwin.govindarajulu Ashwin Govindarajulu
            Balakumaran.Gopal Balakumaran Gopal