  1. Couchbase Server
  2. MB-8008

[system test] rebalance failure due to bad replicators after rebalance


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Fix Version/s: 2.1.0
    • Affects Version/s: 2.0.1
    • Component/s: ns_server
    • Security Level: Public
    • Environment: windows server 2008 R2 64bit ssd vms

    Description

      Install Couchbase Server 2.0.1-185 on 9 Windows SSD VMs.
      Create a cluster with 7 nodes:
      10.3.121.173
      10.3.121.169
      10.3.121.171
      10.3.3.214
      10.3.121.47
      10.3.3.180
      10.3.3.181

      Create 2 buckets:
      "buckets" : {"default" : {"quota": "3500 MB", "replicas": "1", "replica_index": "enable"},
      "sasl": {"count": "1", "quota": "2000MB", "replicas": "1", "replica_index": "disable"}}

      Create 2 design docs:
      "ddocs" : {"create": [{"ddoc":"ddoc1", "view":"view1", "map":"function(doc,meta){emit(doc.city,null);}", "bucket":"default"},
      {"ddoc":"ddoc1", "view":"view2", "map":"function(doc,meta){emit(doc.city, [doc.st, doc.email]);}", "bucket":"default"},
      {"ddoc":"ddoc2", "view":"view1", "map":"function(doc,meta){emit(doc.city,null);}", "bucket":"saslbucket"},
      {"ddoc":"ddoc2", "view":"view2", "map":"function(doc,meta){emit(doc.city, [doc.st, doc.email]);}", "bucket":"saslbucket"}]}
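      For readers unfamiliar with view map functions, here is a rough Python emulation of what view1 and view2 would emit for one document. It is only a sketch: the sample document values are made up for illustration, and the real functions are the JavaScript ones in the spec above, executed by the server's view engine.

```python
# Python stand-ins for the two JavaScript map functions in the ddocs spec.
# Each returns the list of (key, value) rows the view would emit for one doc.

def view1(doc, meta):
    # Mirrors: function(doc,meta){emit(doc.city,null);}
    return [(doc.get("city"), None)]

def view2(doc, meta):
    # Mirrors: function(doc,meta){emit(doc.city, [doc.st, doc.email]);}
    return [(doc.get("city"), [doc.get("st"), doc.get("email")])]

# Hypothetical document; only the field names (city, st, email) come from the spec.
sample_doc = {"city": "Springfield", "st": "IL", "email": "user@example.com"}
sample_meta = {"id": "doc-1"}  # hypothetical metadata, unused by both views

print(view1(sample_doc, sample_meta))
print(view2(sample_doc, sample_meta))
```

      Both views key on doc.city, so a query by city hits the same key space in either view; view2 additionally carries st and email in the row value.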

      No XDCR created.

      Load 20+ million items into both buckets until the resident ratio on both buckets is around 70%.
      Access the cluster for 3 hours with the spec on this page, and run queries at 200~500 ops per second on both buckets:
      5% expire - 5% delete - 5% add - 5% update - 80% gets - 3 hours access phase for default
      70% expire - 20% delete - 5% add - 5% gets - 3 hours access phase for saslbucket
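      The access-phase mixes above can be sketched as a weighted operation picker. This is a minimal illustration, not the system test's actual load generator: the names MIX and pick_ops are hypothetical, and only the percentages are taken from the spec.

```python
import random

# Operation mixes (percent) copied from the access-phase spec.
MIX = {
    "default": {"expire": 5, "delete": 5, "add": 5, "update": 5, "get": 80},
    "saslbucket": {"expire": 70, "delete": 20, "add": 5, "get": 5},
}

def pick_ops(bucket, n, seed=0):
    """Return n operation names drawn according to the bucket's mix."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    ops, weights = zip(*MIX[bucket].items())
    return rng.choices(ops, weights=weights, k=n)

# Sanity check: each mix should account for 100% of operations.
for bucket, mix in MIX.items():
    assert sum(mix.values()) == 100, bucket

print(pick_ops("default", 10))
print(pick_ops("saslbucket", 10))
```

      Note how different the two mixes are: default is read-heavy (80% gets), while saslbucket is dominated by expirations and deletes (90% combined).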

      The test then goes through a "swap rebalance the orchestrator node" phase and a "rebalance in one node" phase.
      In the "rebalance out one node" phase, the rebalance failed with this error:

      Started rebalancing bucket saslbucket ns_rebalancer000 ns_1@10.3.121.171 03:42:54 - Sat Mar 30, 2013
      Starting rebalance, KeepNodes = ['ns_1@10.3.3.181','ns_1@10.3.121.243',
      'ns_1@10.3.121.47','ns_1@10.3.3.182',
      'ns_1@10.3.3.214','ns_1@10.3.121.171',
      'ns_1@10.3.121.169'], EjectNodes = ['ns_1@10.3.121.173']
      ns_orchestrator004 ns_1@10.3.121.171 03:42:54 - Sat Mar 30, 2013

      Bad replicators after rebalance:
      Missing = [{'ns_1@10.3.121.171','ns_1@10.3.3.181',173},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',174},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',175},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',176},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',177},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',178},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',246},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',247},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',248},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',249},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',250},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',251},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',252},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',253},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',254},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',255},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',256},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',257},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',258},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',259},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',260},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',261},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',262},
                 {'ns_1@10.3.121.171','ns_1@10.3.3.181',263}]
      Extras = [] ns_rebalancer002 ns_1@10.3.121.171 06:18:36 - Sat Mar 30, 2013
      Rebalance exited with reason bad_replicas ns_orchestrator002 ns_1@10.3.121.171 06:18:36 - Sat Mar 30, 2013
      Shutting down bucket "saslbucket" on 'ns_1@10.3.121.173' for deletion ns_memcached002 ns_1@10.3.121.173 06:18:37 - Sat Mar 30, 2013
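      To make the "Missing" list in the log easier to read, the entries can be grouped by node pair and collapsed into contiguous ID ranges. The sketch below assumes each tuple is a pair of node names followed by a vbucket ID; that reading of the tuple layout is an assumption, as are the helper names group_missing and to_ranges. The data is copied from the log above.

```python
import re
from collections import defaultdict

# All 24 "Missing" entries from the rebalance log, copied verbatim.
MISSING = """
{'ns_1@10.3.121.171','ns_1@10.3.3.181',173}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',174},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',175}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',176},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',177}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',178},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',246}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',247},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',248}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',249},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',250}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',251},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',252}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',253},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',254}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',255},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',256}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',257},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',258}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',259},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',260}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',261},
{'ns_1@10.3.121.171','ns_1@10.3.3.181',262}, {'ns_1@10.3.121.171','ns_1@10.3.3.181',263}
"""

# Matches one {'node_a','node_b',N} tuple as printed in the log.
TRIPLE = re.compile(r"\{'([^']+)','([^']+)',(\d+)\}")

def group_missing(text):
    """Map (node_a, node_b) -> sorted list of the IDs reported for that pair."""
    groups = defaultdict(list)
    for node_a, node_b, vb in TRIPLE.findall(text):
        groups[(node_a, node_b)].append(int(vb))
    return {pair: sorted(ids) for pair, ids in groups.items()}

def to_ranges(ids):
    """Collapse sorted integers into inclusive (start, end) runs."""
    ranges = []
    for i in ids:
        if ranges and i == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], i)  # extend the current run
        else:
            ranges.append((i, i))            # start a new run
    return ranges

for (node_a, node_b), ids in group_missing(MISSING).items():
    print(node_a, node_b, len(ids), to_ranges(ids))
```

      On this log, every entry involves the same node pair ('ns_1@10.3.121.171' and 'ns_1@10.3.3.181'), and the 24 IDs fall into two contiguous runs, 173-178 and 246-263.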

      Link to manifest file of this build http://builds.hq.northscale.net/latestbuilds/couchbase-server-community_x86_64_2.0.1-185-rel.setup.exe.manifest.xml

      Diags link https://s3.amazonaws.com/bugdb/jira/9ssd-win2.0.1-185-ns-diag-20130330.txt.zip


          People

            alkondratenko Aleksey Kondratenko (Inactive)
            Chisheng Chisheng Hong (Inactive)