  1. Couchbase Server
  2. MB-6955

[system test] Rebalance failed with reason "Partition x not in active nor passive set" in swap rebalance

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: 2.0
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
    • Environment:
      centos 6.2 64bit build 2.0.0-1862

      Description

      Cluster information:

      • 8 CentOS 6.2 64-bit servers, each with a 4-core CPU
      • Each server has 32 GB RAM and a 400 GB SSD.
      • The SSD is formatted as ext4 on /data
      • Each server has its own SSD drive; no disk is shared with another server.
      • Create a cluster of 6 nodes running Couchbase Server 2.0.0-1862
      • The cluster has 2 buckets, default (12 GB) and saslbucket (12 GB).
      • Each bucket has one design doc with 2 views (d1 in default, d11 in saslbucket)
      • Enable consistent views on the cluster (default)
      • Change the Erlang setting in couchbase-server from +A 16 to +S 128:128

      10.6.2.37
      10.6.2.38
      10.6.2.39
      10.6.2.40
      10.6.2.42
      10.6.2.43

      • Load 15 million items into each bucket. Each key is 512 to 1024 bytes in size
      • Query all 4 views from the 2 design docs
      • Mutate 15 million items with key sizes from 1024 to 1500 bytes
      • Do a swap rebalance: add nodes 44 and 45, and remove nodes 39 and 40
      • The rebalance moves some items and then hangs for hours. Filed bug MB-6953
      • Tried to stop the rebalance, but it failed. Will re-open bug MB-6707.
      • Stop Couchbase Server on node 37. Node 37 goes down, but the rebalance does not stop
      • Go to node 38 and click stop rebalance. The rebalance stops. Then restart Couchbase Server on node 37.
      • Once node 37 has been up for a while, rebalance the cluster again. The rebalance fails within a few minutes with this error:

      Rebalance exited with reason
      {{{{badmatch,
          {error,
           {error, <<"Partition 854 not in active nor passive set">>}}},
         [{capi_set_view_manager,handle_call,3},
          {gen_server,handle_msg,5},
          {gen_server,init_it,6},
          {proc_lib,init_p_do_apply,3}]},
        {gen_server,call,
         ['capi_set_view_manager-saslbucket',
          {wait_index_updated,854},
          infinity]}},
       {gen_server,call,
        [{'janitor_agent-saslbucket','ns_1@10.6.2.37'},
         {if_rebalance,<0.8171.289>,{wait_index_updated,513}},
         infinity]}}
      ns_orchestrator002
      ns_1@10.6.2.38
      22:52:21 - Wed Oct 17, 2012
      Server error during processing:
      ["web request failed",
       {path, "/pools/default/buckets/default/statsDirectory"},
       {type,exit},
       {what,
        {noproc,
         {gen_server,call,
          ['capi_set_view_manager-default',
           {foreach_doc, #Fun<capi_ddoc_replication_srv.1.36030090>},
           infinity]}}},
       {trace,
        [{gen_server,call,3},
         {capi_ddoc_replication_srv,foreach_live_ddoc_id,2},
         {capi_ddoc_replication_srv,fetch_ddoc_ids,1},
         {menelaus_stats,couchbase_view_stats_descriptions,1},
         {menelaus_stats,membase_stats_description,1},
         {menelaus_stats,serve_stats_directory,3},
         {menelaus_web_buckets,checking_bucket_access,4},
         {menelaus_web,loop,3}]}]
      menelaus_web019
      ns_1@10.6.2.45
      22:52:19 - Wed Oct 17, 2012
      <0.8771.289> exited with
      {{{{badmatch,
          {error,
           {error, <<"Partition 854 not in active nor passive set">>}}},
         [{capi_set_view_manager,handle_call,3},
          {gen_server,handle_msg,5},
          {gen_server,init_it,6},
          {proc_lib,init_p_do_apply,3}]},
        {gen_server,call,
         ['capi_set_view_manager-saslbucket',
          {wait_index_updated,854},
          infinity]}},
       {gen_server,call,
        [{'janitor_agent-saslbucket','ns_1@10.6.2.37'},
         {if_rebalance,<0.8171.289>,{wait_index_updated,513}},
         infinity]}}
      ns_vbucket_mover000
      ns_1@10.6.2.38
      22:52:10 - Wed Oct 17, 2012
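
      When triaging entries like the ones above, it can help to pull out which partition the view manager rejected and which partitions the callers were waiting on. The following is only a rough sketch for that purpose; the function name `summarize` and the two regular expressions are illustrative and are not part of ns_server. The sample text is copied from the first log entry in this report.

```python
import re

# Patterns written for the exact message shapes seen in this report (assumption:
# other ns_server versions may word these differently).
PARTITION_ERROR = re.compile(r"Partition (\d+) not in active nor passive set")
WAIT_CALL = re.compile(r"\{wait_index_updated,\s*(\d+)\}")


def summarize(entry):
    """Return the partition IDs named in the error and in wait_index_updated calls."""
    return {
        "error_partitions": sorted({int(p) for p in PARTITION_ERROR.findall(entry)}),
        "waited_partitions": sorted({int(p) for p in WAIT_CALL.findall(entry)}),
    }


# Text taken from the "Rebalance exited with reason" entry in this report.
sample = r'''
{{{{badmatch,{error,{error, <<"Partition 854 not in active nor passive set">>}}},
[{capi_set_view_manager,handle_call,3}]},
{gen_server,call,['capi_set_view_manager-saslbucket', {wait_index_updated,854}, infinity]}},
{gen_server,call,[{'janitor_agent-saslbucket','ns_1@10.6.2.37'},
{if_rebalance,<0.8171.289>,{wait_index_updated,513}}, infinity]}}
'''

summary = summarize(sample)
print(summary)
```

      In this entry the outer janitor_agent call waits on partition 513, while the view manager call that fails names partition 854.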

      • This bug is similar to MB-6490, which is marked as fixed
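
      For anyone reproducing the swap rebalance step through the REST API rather than the UI, the form parameters for `POST /controller/rebalance` could be assembled roughly as below. This is a sketch under assumptions: the addresses 10.6.2.44 and 10.6.2.45 for "node 44, 45" are inferred from the addressing pattern in this report (only 10.6.2.45 appears in the logs), the `ns_1@` prefix is taken from the node names in the logs, and the helper names are hypothetical. The new nodes would have to be added to the cluster first. The code only builds the request body and does not contact a server.

```python
from urllib.parse import urlencode


def otp_node(ip):
    """Build the node name form seen in the logs, e.g. ns_1@10.6.2.37."""
    return "ns_1@" + ip


def swap_rebalance_form(current_ips, add_ips, remove_ips):
    """Return the knownNodes/ejectedNodes form fields for a swap rebalance.

    knownNodes lists every node the cluster knows about, including the nodes
    being ejected and the nodes that were just added.
    """
    missing = [ip for ip in remove_ips if ip not in current_ips]
    if missing:
        raise ValueError("cannot eject nodes that are not in the cluster: %s" % missing)
    known = [otp_node(ip) for ip in list(current_ips) + list(add_ips)]
    ejected = [otp_node(ip) for ip in remove_ips]
    return {"knownNodes": ",".join(known), "ejectedNodes": ",".join(ejected)}


# The six nodes listed in the description.
cluster = ["10.6.2.37", "10.6.2.38", "10.6.2.39",
           "10.6.2.40", "10.6.2.42", "10.6.2.43"]

# Assumed addresses for nodes 44 and 45 (see the note above the code).
form = swap_rebalance_form(cluster,
                           add_ips=["10.6.2.44", "10.6.2.45"],
                           remove_ips=["10.6.2.39", "10.6.2.40"])

# URL-encoded body that would be sent to /controller/rebalance.
body = urlencode(form)
print(body)
```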

        Activity

        steve Steve Yen added a comment -

        raising to critical from major due to bug scrub mtg

        farshid Farshid Ghods (Inactive) added a comment - http://review.membase.org/#/c/21752/

          People

          • Assignee:
            alkondratenko Aleksey Kondratenko (Inactive)
            Reporter:
            thuan Thuan Nguyen
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
