Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-45232

Failed to add node back into cluster after node_failover and add back case

    XMLWordPrintable

Details

    Description

      Build: 7.0.0-4735

      Scenario:

      • 4 node cluster, 2 Couchbase buckets

        +---------------+----------+-------+------------+------------+------------------------+-------------------+
        | Node          | Services |  CPU  | Mem_total  | Mem_free   | Swap_mem_used          | Active / Replica  |
        +---------------+----------+-------+------------+------------+------------------------+-------------------+
        | 172.23.100.13 | kv       | 58.53 | 4201684992 | 3464732672 | 91488256 / 3758092288  | 3781 / 7529       |
        | 172.23.100.12 | kv       | 58.72 | 4201689088 | 3418853376 | 110215168 / 3758092288 | 3781 / 7800       |
        | 172.23.100.18 | kv       | 59.69 | 4201684992 | 3461459968 | 102236160 / 3758092288 | 3783 / 7491       |
        | 172.23.100.19 | kv       | 59.50 | 4201684992 | 3445219328 | 115343360 / 3758092288 | 3785 / 7438       |
        +---------------+----------+-------+------------+------------+------------------------+-------------------++--------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
        | Bucket       | Type      | Replicas | Durability | TTL | Items | RAM Quota  | RAM Used  | Disk Used |
        +--------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
        | couchstore.0 | couchbase | 2        | none       | 0   | 7235  | 2986344448 | 98909336  | 424106110 |
        | couchstore.1 | couchbase | 2        | none       | 0   | 7895  | 2986344448 | 101506488 | 513393138 |
        +--------------+-----------+----------+------------+-----+-------+------------+-----------+-----------+
        

      • Load initial 10K docs into both the buckets through Transaction
      • Failover 172.23.100.12 node (Id 02b2ab7412476b02141e1cff571d37ad success)
      • Failover 172.23.100.18 node (Id 6e28ce7a0762769cd03679e696ebd2de success)
      • Eject .12 and .18 using rebalance-out operation (Id b97763b901d7c36e43737d51d7533a7f)
      • Try adding back .12 node

      Observation:

      Adding back node 172.23.100.12 failed with reason,

      Failed to connect to http://172.23.100.12:8091. Could not connect to "172.23.100.12" on port 8091. This could be due to an incorrect host/port combination or a firewall in place between the servers.

      172.23.100.13 - ns_server.info.log

      [user:info,2021-03-24T07:03:33.272-07:00,ns_1@172.23.100.13:<0.29517.0>:ns_cluster:add_node_to_group:89]Failed to add node 172.23.100.12:8091 to cluster. Failed to connect to http://172.23.100.12:8091. Could not connect to "172.23.100.12" on port 8091.  This could be due to an incorrect host/port combination or a firewall in place between the servers.

      172.23.100.12 - ns_server.inf.o.log

      [ns_server:info,2021-03-24T07:03:33.022-07:00,ns_1@172.23.100.12:ns_couchdb_port<0.22809.1>:ns_port_server:log:224]ns_couchdb<0.22809.1>: 2021-03-24 07:03:32.818687
      ns_couchdb<0.22809.1>:     args: ['ns_1@172.23.100.12']
      ns_couchdb<0.22809.1>:     format: "** Connection attempt from disallowed node ~w ** ~n"
      ns_couchdb<0.22809.1>:     label: {error_logger,error_msg}[ns_server:warn,2021-03-24T07:03:33.023-07:00,ns_1@172.23.100.12:memcached_refresh<0.22736.1>:ns_memcached:connect:1207]Unable to connect: {error,{badmatch,[{inet,{error,econnrefused}}]}}.
      [ns_server:info,2021-03-24T07:03:33.228-07:00,ns_1@172.23.100.12:ns_couchdb_port<0.22809.1>:ns_port_server:log:224]ns_couchdb<0.22809.1>: 2021-03-24 07:03:33.027905
      ns_couchdb<0.22809.1>:     args: ['ns_1@172.23.100.12']
      ns_couchdb<0.22809.1>:     format: "** Connection attempt from disallowed node ~w ** ~n"
      ns_couchdb<0.22809.1>:     label: {error_logger,error_msg}
      

      TAF test:

      rebalance_new.swaprebalancetests.SwapRebalanceFailedTests:
          test_add_back_failed_node,doc_size=512,standard_buckets=2,transaction_timeout=150,swap-orchestrator=True,num-swap=2,atomicity=True,replicas=2,durability=PERSIST_TO_MAJORITY,nodes_init=4,num_items=10000

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              ashwin.govindarajulu Ashwin Govindarajulu
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty