Couchbase Server
MB-6745

[system test] Rebalance failed with error "Partition 67 not in active nor passive set" with consistent views enabled


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Affects Version: 2.0
    • Fix Version: 2.0
    • Component: ns_server
    • Security Level: Public
    • Environment: CentOS 6.2 64-bit, build 2.0.0-1746

    Description

      Cluster information:

      • 8 CentOS 6.2 64-bit servers, each with a 4-core CPU
      • Each server has 32 GB RAM and a 400 GB SSD disk
      • 24.8 GB RAM allocated to Couchbase Server on each node
      • SSD disk formatted ext4 and mounted on /data
      • Each server has its own drive; no disk is shared with another server
      • The cluster has 2 buckets, default (12 GB) and saslbucket (12 GB), and was set up with consistent views enabled
      • Each bucket has one design doc with 2 views per doc (default: d1, saslbucket: d11)
      • Created a cluster of 6 nodes running Couchbase Server 2.0.0-1746:

      10.6.2.37
      10.6.2.38
      10.6.2.39
      10.6.2.40
      10.6.2.42
      10.6.2.43

      • Loaded 28 million items into both buckets; each item is between 512 bytes and 1500 bytes in size (see the sketch after this list)
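
      A minimal sketch of the bucket setup and load described above, assuming the standard ns_server REST API on port 8091 and a moxi proxy on port 11211. The bucket names and quotas are from this report; the credentials, key naming, and value contents are placeholders, and the actual test almost certainly used a dedicated load generator:

        import random

        import memcache   # python-memcached; moxi speaks this protocol
        import requests

        BASE = "http://10.6.2.37:8091"
        AUTH = ("Administrator", "password")   # assumed credentials

        # Create the two 12 GB buckets named in this report.
        for name, sasl_password in (("default", ""), ("saslbucket", "password")):
            requests.post(BASE + "/pools/default/buckets", auth=AUTH,
                          data={"name": name, "bucketType": "membase",
                                "ramQuotaMB": 12288, "replicaNumber": 1,
                                "authType": "sasl",
                                "saslPassword": sasl_password}).raise_for_status()

        # Load items whose values are 512-1500 bytes. Shown against the
        # default bucket only; saslbucket would need SASL auth, which plain
        # python-memcached does not speak.
        mc = memcache.Client(["10.6.2.37:11211"])
        for i in range(28_000_000):
            value = "x" * random.randint(512, 1500)
            mc.set("key_%012d" % i, value)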

      Added 2 nodes (10.6.2.44, 10.6.2.45), removed 2 nodes (10.6.2.40, 10.6.2.43), and started a rebalance.
      The rebalance was very slow; after 10 hours of running I stopped it. A sketch of this
      add/remove/rebalance sequence follows.
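
      A minimal sketch of that sequence against the ns_server REST API; the credentials are placeholders, and the ns_1@<ip> node naming is taken from the traces below:

        import time

        import requests

        BASE = "http://10.6.2.37:8091"
        AUTH = ("Administrator", "password")   # assumed credentials

        # Add the two incoming nodes.
        for host in ("10.6.2.44", "10.6.2.45"):
            requests.post(BASE + "/controller/addNode", auth=AUTH,
                          data={"hostname": host, "user": AUTH[0],
                                "password": AUTH[1]}).raise_for_status()

        # Start a rebalance that ejects the two outgoing nodes.
        known = ",".join("ns_1@10.6.2.%d" % n
                         for n in (37, 38, 39, 40, 42, 43, 44, 45))
        requests.post(BASE + "/controller/rebalance", auth=AUTH,
                      data={"knownNodes": known,
                            "ejectedNodes":
                                "ns_1@10.6.2.40,ns_1@10.6.2.43"}).raise_for_status()

        # Poll until the rebalance finishes; POST /controller/stopRebalance
        # is how the 10-hour rebalance above was interrupted.
        while requests.get(BASE + "/pools/default/rebalanceProgress",
                           auth=AUTH).json().get("status") != "none":
            time.sleep(30)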
      I then restarted the rebalance, and it failed with this error:

      Rebalance exited with reason
      {{{{badmatch,
          {error,
           {error,<<"Partition 67 not in active nor passive set">>}}},
         [{capi_set_view_manager,handle_call,3},
          {gen_server,handle_msg,5},
          {gen_server,init_it,6},
          {proc_lib,init_p_do_apply,3}]},
        {gen_server,call,
         ['capi_set_view_manager-saslbucket',
          {wait_index_updated,67},
          infinity]}},
       {gen_server,call,
        [{'janitor_agent-saslbucket','ns_1@10.6.2.42'},
         {if_rebalance,<0.16335.137>,
          {get_replication_persistence_checkpoint_id,684}},
         infinity]}}
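
      The badmatch comes from capi_set_view_manager's wait_index_updated call for partition 67 on ns_1@10.6.2.42: the saslbucket set view has that partition in neither its active nor its passive set, so the index can never report it as updated. Assuming the couch_set_view _info endpoint that 2.0 builds expose on the view port (8092), the partition sets on the failing node could be inspected like this (design doc d11 is from the setup above; the field names are an assumption):

        import requests

        # Port 8092 is the CAPI/view port; d11 is the saslbucket design doc
        # named in the setup above. The active/passive field names are what
        # couch_set_view's _info endpoint is assumed to return in 2.0 builds.
        info = requests.get("http://10.6.2.42:8092/_set_view/saslbucket"
                            "/_design/d11/_info").json()
        active = info.get("active_partitions", [])
        passive = info.get("passive_partitions", [])
        print("active: ", active)
        print("passive:", passive)
        print("partition 67 known to the index:", 67 in active + passive)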


      Server error during processing:
      ["web request failed",
       {path,"/pools/default/buckets/default/ddocs"},
       {type,exit},
       {what,
        {noproc,
         {gen_server,call,
          ['capi_set_view_manager-default',
           {foreach_doc,#Fun<capi_ddoc_replication_srv.2.62853835>},
           infinity]}}},
       {trace,
        [{gen_server,call,3},
         {capi_ddoc_replication_srv,full_live_ddocs,1},
         {capi_ddoc_replication_srv,sorted_full_live_ddocs,1},
         {menelaus_web_buckets,handle_ddocs_list,3},
         {menelaus_web_buckets,checking_bucket_access,4},
         {menelaus_web,loop,3},
         {mochiweb_http,headers,5},
         {proc_lib,init_p_do_apply,3}]}]
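
      The second failure is the UI's design-document listing dying with noproc because the default bucket's capi_set_view_manager process was gone at that point. Assuming admin credentials, the same request can be reissued directly to confirm (the path is verbatim from the trace):

        import requests

        # Path is verbatim from the trace above; credentials are assumed.
        r = requests.get("http://10.6.2.37:8091"
                         "/pools/default/buckets/default/ddocs",
                         auth=("Administrator", "password"))
        print(r.status_code, r.text[:200])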

      Link to collect_info output from all nodes: https://s3.amazonaws.com/packages.couchbase/collect_info/orange/2_0_0/201209/8nodes-col-info-1746-reb-failed-err-Partition_67_not_in_active_nor_passive_set-20120926-112422.tgz


      People

        Assignee: Aleksey Kondratenko (alkondratenko)
        Reporter: Thuan Nguyen (thuan)
