  1. Couchbase Server
  2. MB-10412

[system test][windows] rebalance failed after failing over a node

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • 2.5.1
    • 2.5.1
    • couchbase-bucket
    • Security Level: Public
    • None
    • windows 2008 R2 64bit

    Description

      Environment:
      8 physical servers with 32 GB RAM each.
      Source:
      10.2.1.61
      10.2.1.62
      10.2.1.63
      10.2.1.64

      Target:
      10.2.1.65
      10.2.1.66
      10.2.1.67
      10.2.1.68

      Couchbase Server 2.5.1-1070 on both source and target.
      Create a 3-node cluster at each of source and target.
      Create 2 buckets (9 GB RAM each) with 1 replica on both clusters.
      Create unidirectional XDCR from source to target on both buckets, with XDCR set to version 2.
      Keys are loaded at the source.

      After the resident ratio at the source dropped below 100%, run the access phase for 3 hours.
      At source, add node 64 to the cluster => passed
      At source, remove node 63 => passed
      At source, add back node 63 => passed
      At source, auto-failover node 64 and rebalance => rebalance failed with errors:

      Rebalance exited with reason {{badmatch,{error,closed,
          [{mc_client_binary,cmd_binary_vocal_recv,5},
           {mc_client_binary,select_bucket,2},
           {ns_memcached,ensure_bucket,2},
           {ns_memcached,handle_info,2},
           {gen_server,handle_msg,5},
           {ns_memcached,init,1},
           {gen_server,init_it,6},
           {proc_lib,init_p_do_apply,3}]},
        {gen_server,call,
          ['ns_memcached-sasl-2',{set_vbucket,338,active},180000]}},
        {gen_server,call,
          [{'janitor_agent-sasl-2','ns_1@10.2.1.64'},
           {get_tap_docs_estimate_many_taps,335,
             [<<"replication_building_335_'ns_1@10.2.1.61'">>,
              <<"replication_building_335_'ns_1@10.2.1.63'">>]},
           infinity]}}

      Before this error, I saw another error showing that memcached had disconnected on node 64:

      Control connection to memcached on 'ns_1@10.2.1.64' disconnected: {{badmatch,{error,closed}},
          [{mc_client_binary,cmd_binary_vocal_recv,5},
           {mc_client_binary,select_bucket,2},
           {ns_memcached,ensure_bucket,2},
           {ns_memcached,handle_info,2},
           {gen_server,handle_msg,5},
           {ns_memcached,init,1},
           {gen_server,init_it,6},
           {proc_lib,init_p_do_apply,3}]}
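
The two errors above appear to carry the same call stack, which would be consistent with the rebalance failure following from the earlier memcached disconnect on node 64. A minimal sketch for checking this mechanically is shown here; the helper name `extract_frames` and the regular expression are illustrative assumptions and are not part of any Couchbase tooling. It pulls the `{Module,Function,Arity}` frames out of each error text and compares them.

```python
import re

# Matches Erlang stack frames of the form {module,function,arity},
# tolerating optional whitespace after the commas.
FRAME_RE = re.compile(
    r"\{\s*([a-z_][A-Za-z0-9_]*)\s*,\s*([a-z_][A-Za-z0-9_]*)\s*,\s*(\d+)\s*\}"
)

def extract_frames(text):
    """Return the (module, function, arity) frames found in an error text."""
    return [(m, f, int(a)) for m, f, a in FRAME_RE.findall(text)]

# Error text copied from the rebalance failure in this report.
REBALANCE_ERROR = """
{{badmatch,{error,closed,
[{mc_client_binary,cmd_binary_vocal_recv,5},
 {mc_client_binary,select_bucket,2},
 {ns_memcached,ensure_bucket,2},
 {ns_memcached,handle_info,2},
 {gen_server,handle_msg,5},
 {ns_memcached,init,1},
 {gen_server,init_it,6},
 {proc_lib,init_p_do_apply,3}]},
{gen_server,call,['ns_memcached-sasl-2',{set_vbucket,338,active},180000]}},
{gen_server,call,[{'janitor_agent-sasl-2','ns_1@10.2.1.64'},
{get_tap_docs_estimate_many_taps,335,
[<<"replication_building_335_'ns_1@10.2.1.61'">>,
 <<"replication_building_335_'ns_1@10.2.1.63'">>]},infinity]}}
"""

# Error text copied from the earlier memcached disconnect on node 64.
DISCONNECT_ERROR = """
{{badmatch,{error,closed}},
[{mc_client_binary, cmd_binary_vocal_recv, 5},
 {mc_client_binary, select_bucket, 2},
 {ns_memcached, ensure_bucket, 2},
 {ns_memcached, handle_info, 2},
 {gen_server, handle_msg, 5},
 {ns_memcached, init,1},
 {gen_server, init_it, 6},
 {proc_lib, init_p_do_apply, 3}]}
"""

rebalance_frames = extract_frames(REBALANCE_ERROR)
disconnect_frames = extract_frames(DISCONNECT_ERROR)

# True when both errors report the same stack, frame for frame.
same_stack = rebalance_frames == disconnect_frames
print("same stack:", same_stack)
for module, function, arity in rebalance_frames:
    print(f"  {module}:{function}/{arity}")
```

Tuples such as `{set_vbucket,338,active}` are deliberately not matched, because their second or third element does not fit the function-name/arity shape. A matching stack only shows that both errors come from the same code path (`ns_memcached` selecting the bucket on a closed connection); it does not by itself identify why memcached closed the connection.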

            People

              thuan Thuan Nguyen
              thuan Thuan Nguyen