Couchbase Server / MB-29682

Rebalance exited with reason {child_died,bad_replicas}

    Description

      A separate ticket for rebalance failures observed in MB-29217 and MB-26791.

      Test scenario:

      • 9 nodes
      • 1 bucket, 1 replica, full eviction
      • 1B items (~1KB), 5-10% resident ratio
      • 15K ops/sec (90% read, 10% update), 10% cache miss ratio (before rebalance)
      • Swap rebalance of one node (172.23.96.108 -> 172.23.96.109)
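
For a rough sense of scale, the scenario above can be turned into back-of-envelope numbers. This is only a sketch: it assumes each item is exactly 1 KiB, that the 10% cache miss ratio applies to reads only, and that data is spread evenly across the 9 nodes. It ignores metadata and on-disk overhead, and the function and constant names are illustrative.

```python
# Back-of-envelope estimate for the test scenario (all figures approximate).
ITEMS = 1_000_000_000
ITEM_SIZE_BYTES = 1024      # "~1KB" in the ticket; assumed to be exactly 1 KiB
REPLICAS = 1
NODES = 9
OPS_PER_SEC = 15_000
READ_PERCENT = 90
UPDATE_PERCENT = 10
CACHE_MISS_PERCENT = 10     # assumed to apply to reads only

GIB = 2 ** 30


def workload_estimate():
    """Return approximate data sizes (GiB) and operation rates (ops/sec)."""
    active_bytes = ITEMS * ITEM_SIZE_BYTES
    total_bytes = active_bytes * (1 + REPLICAS)   # active copy plus replicas
    reads = OPS_PER_SEC * READ_PERCENT // 100
    updates = OPS_PER_SEC * UPDATE_PERCENT // 100
    return {
        "active_data_gib": active_bytes / GIB,
        "total_data_gib": total_bytes / GIB,
        "per_node_gib": total_bytes / NODES / GIB,
        "reads_per_sec": reads,
        "updates_per_sec": updates,
        "missed_reads_per_sec": reads * CACHE_MISS_PERCENT // 100,
    }


if __name__ == "__main__":
    for name, value in workload_estimate().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions each node holds on the order of 200 GiB of active plus replica data, most of it non-resident, so a swap rebalance has to move roughly that much through the outgoing and incoming nodes while the cluster keeps serving traffic.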

      [user:error,2018-05-06T09:34:17.933-07:00,ns_1@172.23.96.100:<0.2270.0>:ns_orchestrator:do_log_rebalance_completion:1122]Rebalance exited with reason {child_died,bad_replicas}
      

      [user:info,2018-05-06T09:34:17.932-07:00,ns_1@172.23.96.100:<0.241.36>:ns_rebalancer:verify_replication:985]Bad replicators after rebalance:
      Missing = [{'ns_1@172.23.96.109','ns_1@172.23.96.107',1023}]
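
When triaging several failures of this kind, it can help to pull the entries out of the `Missing = [...]` line programmatically. The sketch below only extracts the three fields of each Erlang tuple. Reading them as (source node, destination node, vBucket) is an assumption, although it matches the `109->107` direction in the DCP message further down. `parse_missing` is a hypothetical helper, not part of ns_server.

```python
import re

# One Erlang tuple of the form {'node','node',integer}.
TUPLE_RE = re.compile(r"\{'([^']+)','([^']+)',(\d+)\}")


def parse_missing(line):
    """Extract (first_node, second_node, vbucket) tuples from a 'Missing = [...]' line.

    The field meaning (source, destination, vBucket) is assumed, not confirmed.
    """
    return [(a, b, int(vb)) for a, b, vb in TUPLE_RE.findall(line)]


if __name__ == "__main__":
    log_line = "Missing = [{'ns_1@172.23.96.109','ns_1@172.23.96.107',1023}]"
    for first, second, vbucket in parse_missing(log_line):
        print(f"vb {vbucket}: {first} -> {second}")
```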
      

      Previously:

      2018-05-06T08:07:22.818569Z INFO (bucket-1) DCP (Consumer) eq_dcpq:replication:ns_1@172.23.96.109->ns_1@172.23.96.107:bucket-1 - Disconnecting because a message has not been received for 360s. lastMessageTime:361
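
To check whether other replication streams hit the same idle disconnect, the message above can be matched with a regular expression. This is a minimal sketch based solely on the one line quoted in this ticket; the format may differ in other versions, and `parse_dcp_idle_disconnect` plus the field names `src`, `dst`, `timeout_s` and `idle_s` are made up for illustration.

```python
import re

# Pattern derived from the single log line quoted in the ticket.
DCP_IDLE_RE = re.compile(
    r"eq_dcpq:replication:(?P<src>[^-\s]+)->(?P<dst>[^:\s]+):(?P<bucket>\S+)"
    r" - Disconnecting because a message has not been received for"
    r" (?P<timeout_s>\d+)s\. lastMessageTime:(?P<idle_s>\d+)"
)


def parse_dcp_idle_disconnect(line):
    """Return stream details for an idle-disconnect message, or None if no match."""
    match = DCP_IDLE_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["timeout_s"] = int(info["timeout_s"])
    info["idle_s"] = int(info["idle_s"])
    return info


if __name__ == "__main__":
    line = (
        "2018-05-06T08:07:22.818569Z INFO (bucket-1) DCP (Consumer) "
        "eq_dcpq:replication:ns_1@172.23.96.109->ns_1@172.23.96.107:bucket-1 - "
        "Disconnecting because a message has not been received for 360s. "
        "lastMessageTime:361"
    )
    print(parse_dcp_idle_disconnect(line))
```

For the line above this yields the same node pair that later appears in the `Missing` list, which suggests the dropped stream was not re-established before replication was verified.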
      

            People

              dhaikney David Haikney (Inactive)
              pavelpaulau Pavel Paulau (Inactive)
