Couchbase Server / MB-47087

Multi node graceful failover + rebalance out fails with "Rebalance exited with reason {badarg,[{io_lib,format,["Rebalancing bucket ~p with config ~p"


Details

    • Untriaged
    • Centos 64-bit
    • 1
    • No

    Description

      Script to Repro

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/win10-bucket-ops.ini rerun=False,get-cbcollect-info=True,quota_percent=95,crash_warning=True,bucket_storage=magma,enable_dp=True,retry_get_process_num=200 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_graceful_failover_rebalance_out,nodes_init=5,nodes_failover=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests,data_load_stage=after,scrape_interval=2,rebalance_moves_per_node=64,skip_validations=False,GROUP=P0_failover_and_rebalance_out'
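
The `-t` argument above packs the test path and its parameters into one comma-separated string. The following is a minimal Python sketch for reading it; `parse_test_spec` is a hypothetical helper written for this illustration and is not testrunner's own parser.

```python
def parse_test_spec(spec):
    """Split 'module.Class.test,k1=v1,k2=v2' into (test_path, params).

    Hypothetical helper for illustration only; values are kept as strings.
    """
    test_path, _, rest = spec.partition(",")
    params = {}
    for item in filter(None, rest.split(",")):
        key, _, value = item.partition("=")
        params[key] = value
    return test_path, params


# The -t value copied from the repro command above.
spec = (
    "bucket_collections.collections_rebalance.CollectionsRebalance."
    "test_data_load_collections_with_graceful_failover_rebalance_out,"
    "nodes_init=5,nodes_failover=2,"
    "bucket_spec=multi_bucket.buckets_for_rebalance_tests,"
    "data_load_stage=after,scrape_interval=2,rebalance_moves_per_node=64,"
    "skip_validations=False,GROUP=P0_failover_and_rebalance_out"
)

test_path, params = parse_test_spec(spec)
print(test_path.rsplit(".", 1)[-1])
print(params["nodes_init"], params["nodes_failover"])  # 5 2
```

Read this way, the run starts with 5 nodes, fails over 2 of them, and loads data after the failover and rebalance out.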
      

      Steps to Repro
      1. Create a 5 node cluster
      2021-06-23 21:09:04,476 | test | INFO | pool-5-thread-6 | [table_view:display:72] Rebalance Overview
      --------------------------------------------------------------------------------
      Nodes          | Services | Version               | CPU           | Status
      --------------------------------------------------------------------------------
      172.23.98.196  | kv       | 7.1.0-1036-enterprise | 5.39450466347 | Cluster node
      172.23.98.195  | None     |                       |               | <--- IN ---
      172.23.121.10  | None     |                       |               | <--- IN ---
      172.23.104.186 | None     |                       |               | <--- IN ---
      172.23.120.201 | None     |                       |               | <--- IN ---
      --------------------------------------------------------------------------------

      2. Create bucket/scopes/collections/data
      2021-06-23 21:14:21,836 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
      Bucket | Type | Replicas | Durability | TTL | Items | RAM Quota | RAM Used | Disk Used
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
      4yZIQSEJJD6IzNnFF4kWkPgVRxWd0tC-1DBmRsF05LD1MVBbaPNwgg2vNjO8SCRd_syOC9QCrPVlNpP_x-15-414000 | couchbase | 2 | none | 0 | 3000000 | 10485760000 | 2613283152 | 8185163251
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------

      3. Change a few settings and gracefully fail over 2 nodes (172.23.104.186 and 172.23.120.201)

      2021-06-23 21:14:29,920 | test  | INFO    | MainThread | [collections_rebalance:setUp:59] Changing scrape interval to 2
      2021-06-23 21:14:32,463 | test  | INFO    | MainThread | [cluster_ready_functions:set_rebalance_moves_per_nodes:129] Changed Rebalance settings: {u'rebalanceMovesPerNode': 64}
      2021-06-23 21:14:32,464 | test  | INFO    | MainThread | [collections_rebalance:load_collections_with_rebalance:932] Doing collection data load after graceful_failover_rebalance_out
      2021-06-23 21:14:32,466 | test  | INFO    | MainThread | [collections_rebalance:rebalance_operation:390] Starting rebalance operation of type : graceful_failover_rebalance_out
      2021-06-23 21:14:32,467 | test  | INFO    | MainThread | [collections_rebalance:rebalance_operation:602] failing over nodes [ip:172.23.104.186 port:8091 ssh_username:root, ip:172.23.120.201 port:8091 ssh_username:root]
      

      4. Wait for graceful failover to complete

      2021-06-23 21:21:03,559 | test  | WARNING | MainThread | [rest_client:get_nodes:1756] 172.23.104.186 - Node not part of cluster inactiveFailed
      2021-06-23 21:21:03,561 | test  | WARNING | MainThread | [rest_client:get_nodes:1756] 172.23.120.201 - Node not part of cluster inactiveFailed
      

      5. Rebalance out the two failed-over nodes
      2021-06-23 21:21:59,946 | test | INFO | pool-5-thread-11 | [table_view:display:72] Rebalance Overview
      ---------------------------------------------------------------------------------
      Nodes          | Services | Version               | CPU            | Status
      ---------------------------------------------------------------------------------
      172.23.98.196  | kv       | 7.1.0-1036-enterprise | 10.5732484076  | Cluster node
      172.23.98.195  | kv       | 7.1.0-1036-enterprise | 24.7787610619  | Cluster node
      172.23.104.186 | kv       | 7.1.0-1036-enterprise | 1.00755667506  | --- OUT --->
      172.23.120.201 | kv       | 7.1.0-1036-enterprise | 0.752823086575 | --- OUT --->
      172.23.121.10  | kv       | 7.1.0-1036-enterprise | 12.2474747475  | Cluster node
      ---------------------------------------------------------------------------------

      6. The rebalance fails; the error is captured in the attached rebalance_failure.txt.
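
Steps 3 to 5 can also be driven by hand over the REST API instead of through the test. The sketch below only builds the requests and does not send them. It assumes the default `ns_1@<ip>` otpNode naming, plain HTTP on port 8091, and the `/settings/rebalance`, `/controller/startGracefulFailover`, and `/controller/rebalance` endpoints as described in the Couchbase REST documentation (including passing several `otpNode` parameters in one graceful failover call), which should be verified for the build under test. `otp_node` and `build_requests` are hypothetical helper names, and authentication is omitted.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def otp_node(ip):
    # Assumption: nodes use the default otpNode naming "ns_1@<ip>".
    return "ns_1@" + ip


def build_requests(base_url, cluster_ips, failover_ips, moves_per_node):
    """Return unsent POST requests for steps 3 and 5 (illustrative only)."""

    def post(path, fields):
        body = urlencode(fields).encode("ascii")
        return Request(base_url + path, data=body, method="POST")

    known = [otp_node(ip) for ip in cluster_ips]
    ejected = [otp_node(ip) for ip in failover_ips]
    return [
        # Step 3a: change the rebalance setting seen in the log.
        post("/settings/rebalance", [("rebalanceMovesPerNode", moves_per_node)]),
        # Step 3b: gracefully fail over both nodes in one call.
        post("/controller/startGracefulFailover",
             [("otpNode", node) for node in ejected]),
        # Step 5: rebalance, ejecting the failed-over nodes.
        post("/controller/rebalance",
             [("knownNodes", ",".join(known)),
              ("ejectedNodes", ",".join(ejected))]),
    ]


cluster = ["172.23.98.196", "172.23.98.195", "172.23.121.10",
           "172.23.104.186", "172.23.120.201"]
failed = ["172.23.104.186", "172.23.120.201"]

prepared = build_requests("http://172.23.98.196:8091", cluster, failed, 64)
for req in prepared:
    print(req.get_method(), req.full_url, parse_qs(req.data.decode("ascii")))
```

To actually run the steps, each request would need credentials added and would be passed to `urllib.request.urlopen`, waiting for the graceful failover to finish before sending the rebalance.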

      cbcollect_info attached.
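
For reference, the timestamps in steps 3 and 4 give a rough duration for the graceful failover. A small Python sketch, assuming the first `inactiveFailed` warning approximates completion; `log_time` is a hypothetical helper for the ` | `-separated log format shown above.

```python
from datetime import datetime

# Timestamp layout used by the log lines in this report.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def log_time(line):
    """Parse the leading timestamp field of a '|'-separated log line."""
    stamp = line.split("|", 1)[0].strip()
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


# Lines copied from steps 3 and 4 of this report.
failover_started = (
    "2021-06-23 21:14:32,467 | test  | INFO    | MainThread | "
    "[collections_rebalance:rebalance_operation:602] failing over nodes "
    "[ip:172.23.104.186 port:8091 ssh_username:root, "
    "ip:172.23.120.201 port:8091 ssh_username:root]"
)
first_inactive_failed = (
    "2021-06-23 21:21:03,559 | test  | WARNING | MainThread | "
    "[rest_client:get_nodes:1756] 172.23.104.186 - "
    "Node not part of cluster inactiveFailed"
)

elapsed = log_time(first_inactive_failed) - log_time(failover_started)
print(elapsed.total_seconds())  # 391.092
```

That is roughly six and a half minutes between the failover request and the nodes being reported as `inactiveFailed`.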

      Attachments

        1. bucket_conf (68 kB)
        2. MB-47087_repro.log (106 kB)
        3. rebalance_failure.txt (65 kB)
        4. test.log (108 kB)


          Activity

            Field: Original Value -> New Value

            Balakumaran.Gopal Balakumaran Gopal created issue
            Balakumaran.Gopal Balakumaran Gopal made changes
                Attachment: test.log [ 147271 ]
            Balakumaran.Gopal Balakumaran Gopal made changes
                Assignee: Balakumaran Gopal [ balakumaran.gopal ] -> Daniel Owen [ owend ]
            ben.huddleston Ben Huddleston made changes
                Rank: Ranked lower
            ben.huddleston Ben Huddleston made changes
                Assignee: Daniel Owen [ owend ] -> Ben Huddleston [ ben.huddleston ]
            sumedh.basarkod Sumedh Basarkod made changes
                Labels: functional-test magma -> functional-test magma volume-test
            ben.huddleston Ben Huddleston made changes
                Assignee: Ben Huddleston [ ben.huddleston ] -> Meni Hillel [ JIRAUSER25407 ]
            Balakumaran.Gopal Balakumaran Gopal made changes
                Attachment: MB-47087_repro.log [ 150292 ]
            meni.hillel Meni Hillel (Inactive) made changes
                Assignee: Meni Hillel [ JIRAUSER25407 ] -> Steve Watanabe [ steve.watanabe ]
            steve.watanabe Steve Watanabe made changes
                Component/s: ns_server [ 10019 ]
                Component/s: couchbase-bucket [ 10173 ]
            steve.watanabe Steve Watanabe made changes
                Attachment: bucket_conf [ 150611 ]
            meni.hillel Meni Hillel (Inactive) made changes
                Assignee: Steve Watanabe [ steve.watanabe ] -> Hareen Kancharla [ JIRAUSER25304 ]
            meni.hillel Meni Hillel (Inactive) made changes
                Fix Version/s: 7.0.1 [ 17104 ]
                Fix Version/s: Neo [ 17615 ]
            meni.hillel Meni Hillel (Inactive) made changes
                Priority: Major [ 3 ] -> Critical [ 2 ]
            ben.huddleston Ben Huddleston made changes
                Epic Link: MB-30659 [ 88207 ]
            wayne Wayne Siu made changes
                Link: This issue blocks MB-47469 [ MB-47469 ]
            hareen.kancharla Hareen Kancharla made changes
                Resolution: Fixed [ 1 ]
                Status: Open [ 1 ] -> Closed [ 6 ]
            meni.hillel Meni Hillel (Inactive) made changes
                Summary: [Magma] - Multi node graceful failover + rebalance out fails with "Rebalance exited with reason {badarg,[{io_lib,format,["Rebalancing bucket ~p with config ~p" -> Multi node graceful failover + rebalance out fails with "Rebalance exited with reason {badarg,[{io_lib,format,["Rebalancing bucket ~p with config ~p"
            lynn.straus Lynn Straus made changes
                Fix Version/s: 7.0.2 [ 18012 ]
            lynn.straus Lynn Straus made changes
                Fix Version/s: 7.0.1 [ 17104 ]
            lynn.straus Lynn Straus made changes
                Fix Version/s: 7.0.1 [ 17104 ]
            ritam.sharma Ritam Sharma made changes
                Remote Link: This issue links to "Page (Couchbase, Inc. Wiki)" [ 23035 ]
            wayne Wayne Siu made changes
                Fix Version/s: Neo [ 17615 ]
            wayne Wayne Siu made changes
                Labels: functional-test magma volume-test -> approved-for-7.0.1 functional-test magma volume-test

            People

              Assignee: hareen.kancharla Hareen Kancharla
              Reporter: Balakumaran.Gopal Balakumaran Gopal
              Votes: 0
              Watchers: 11
