  1. Couchbase Server
  2. MB-60166

Unexpected data node failover

Details

    • Bug
    • Resolution: Not a Bug
    • Major
    • 7.2.4
    • 7.2.4
    • ns_server
    • Enterprise Edition 7.2.4 build 7059

    Description

      Steps:
      1. Create a 9-node cluster:

      +----------------+----------+
      | Node           | Services | 
      +----------------+----------+
      | 172.23.106.179 | n1ql     |
      | 172.23.107.65  | kv       |
      | 172.23.216.218 | index    |
      | 172.23.105.255 | index    |
      | 172.23.217.110 | kv       | 
      | 172.23.123.44  | kv       |
      | 172.23.107.26  | kv       |
      | 172.23.97.210  | kv       |
      | 172.23.109.76  | n1ql     | 
      +----------------+----------+

      2. Create a couchstore bucket "bucket-0" with 2 replicas:

      [menelaus:info,2023-12-14T04:42:30.702-08:00,ns_1@172.23.97.210:<0.7751.0>:menelaus_web_buckets:do_bucket_create:645]Created bucket "bucket-0" of type: couchbase
      [{num_replicas,2},
       {replica_index,true},
       {ram_quota,3149922304},
       {durability_min_level,none},
       {autocompaction,false},
       {purge_interval,undefined},
       {flush_enabled,false},
       {num_threads,3},
       {eviction_policy,value_only},
       {conflict_resolution_type,seqno},
       {storage_mode,couchstore},
       {max_ttl,0},
       {compression_mode,passive}]
      [ns_server:debug,2023-12-14T04:42:30.702-08:00,ns_1@172.23.97.210:chronicle_kv_log<0.1059.0>:chronicle_kv_log:log:59]update (key: {bucket,"bucket-0",collections}, rev: {<<"0891e3c0852f6003ec0720fda89cc4e6">>,
                                                          81})
      [{uid,0},
       {next_uid,1},
       {next_scope_uid,8},
      

      3. Enable auto-failover with max count = 5 and timeout = 60 seconds.
      4. Induce failover on 4 nodes as follows, while rebalancing out node 172.23.123.44:

      +----------------+----------+-------------+----------------+
      | Node           | Services | Node status | Failover type  |
      +----------------+----------+-------------+----------------+
      | 172.23.106.179 | n1ql     | active      | stop_couchbase |
      | 172.23.123.44  | kv       | active      | stop_memcached |
      | 172.23.107.26  | kv       | active      | stop_memcached |
      | 172.23.107.65  | kv       | active      | stop_memcached |
      +----------------+----------+-------------+----------------+
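
For reference, the auto-failover settings from step 3 could be applied through the cluster REST API. The report does not say how they were set, so this is only a sketch. It assumes the `POST /settings/autoFailover` endpoint on port 8091 with the `enabled`, `timeout`, and `maxCount` form fields. The target host and the credentials are placeholders, and the request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_autofailover_request(host, user, password, timeout_s=60, max_count=5):
    """Build (but do not send) a POST that enables auto-failover."""
    body = urllib.parse.urlencode(
        {"enabled": "true", "timeout": timeout_s, "maxCount": max_count}
    ).encode()
    req = urllib.request.Request(
        f"http://{host}:8091/settings/autoFailover", data=body, method="POST"
    )
    # Basic auth header; the credentials passed in below are placeholders.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req


# Hypothetical target node and credentials, for illustration only.
req = build_autofailover_request("172.23.97.210", "Administrator", "password")
print(req.get_method(), req.full_url, req.data.decode())
# urllib.request.urlopen(req) would send it to a live cluster.
```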

      Observation:
      The rebalance exits, then 3 nodes get failed over.
      This is an intermittent issue.

      [ns_server:debug,2023-12-14T04:44:05.766-08:00,ns_1@172.23.217.110:chronicle_kv_log<0.3445.0>:chronicle_kv_log:log:59]update (key: counters, rev: {<<"0891e3c0852f6003ec0720fda89cc4e6">>,218})
      [{failover_start,{1702557845,1}},
       {rebalance_fail,{1702557786,1}},
       {rebalance_start,{1702557768,2}},
       {rebalance_success,{1702557736,1}}]
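
To read the `counters` update as a timeline, the sketch below sorts the four entries by time and prints the gap between consecutive events. It assumes each pair is (Unix timestamp in seconds, count); that interpretation is not stated in the log.

```python
from datetime import datetime, timezone

# Values copied from the counters update above. Treating each pair as
# (unix_timestamp_seconds, count) is an assumption.
counters = {
    "failover_start": (1702557845, 1),
    "rebalance_fail": (1702557786, 1),
    "rebalance_start": (1702557768, 2),
    "rebalance_success": (1702557736, 1),
}


def timeline(counters):
    """Return (event, UTC time, seconds since previous event), oldest first."""
    ordered = sorted(counters.items(), key=lambda kv: kv[1][0])
    rows = []
    prev = None
    for name, (ts, _count) in ordered:
        when = datetime.fromtimestamp(ts, tz=timezone.utc)
        delta = 0 if prev is None else ts - prev
        rows.append((name, when.isoformat(), delta))
        prev = ts
    return rows


rows = timeline(counters)
for row in rows:
    print(row)
```

Under that assumption, failover_start falls 59 seconds after rebalance_fail, which is at least consistent with the 60-second auto-failover timeout.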

       [ns_server:debug,2023-12-14T04:44:06.228-08:00,ns_1@172.23.217.110:chronicle_kv_log<0.3445.0>:chronicle_kv_log:log:59]update (key: {service_map,n1ql}, rev: {<<"0891e3c0852f6003ec0720fda89cc4e6">>,
                                             222})
      ['ns_1@172.23.109.76']
      [ns_server:debug,2023-12-14T04:44:06.229-08:00,ns_1@172.23.217.110:chronicle_kv_log<0.3445.0>:chronicle_kv_log:log:59]update (key: {service_failover_pending,n1ql}, rev: {<<"0891e3c0852f6003ec0720fda89cc4e6">>,
                                                          222})

      auto_failover_cfg ->
      [{'_vclock',[{<<"be3c0ecdc5653b8e9db7cc578382ae32">>,{4,63869777046}}]},
       {enabled,true},
       {timeout,60},
       {count,3},
       {max_count,5},
       {failed_over_server_groups,[]},
       {failover_preserve_durability_majority,false},
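
A small sketch of how the fields above relate, assuming `count` is the number of auto-failovers already performed and `max_count` is the cap set in step 3. The helper name is hypothetical and is not part of ns_server.

```python
def remaining_auto_failovers(cfg):
    """Auto-failovers still allowed before max_count is reached (assumed semantics)."""
    if not cfg["enabled"]:
        return 0
    return max(cfg["max_count"] - cfg["count"], 0)


# Values copied from the auto_failover_cfg dump above.
cfg = {"enabled": True, "timeout": 60, "count": 3, "max_count": 5}
remaining = remaining_auto_failovers(cfg)
print(remaining)
```

With the values from the dump, 3 of the 5 allowed auto-failovers have been used, which matches the 3 nodes reported as failed over.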
       

       

          People

            pulkit.matta Pulkit Matta