Couchbase Server / MB-7083

Rebalance exited with reason bulk_set_vbucket_state_failed {'EXIT',{{{{unexpected_reason,{timeout,{gen_server,call,[...,silence_upstream]}}}...


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version: 2.0
    • Affects Version: 2.0-beta-2
    • Component: ns_server
    • Security Level: Public
    • Labels: None
    Description

      version=2.0.0-1931-rel
      http://qa.hq.northscale.net/job/centos-64-2.0-rebalance-regressions/115/consoleFull
      ./testrunner -i /tmp/rebalance_regression.ini get-logs=True,wait_timeout=90 -t swaprebalance.SwapRebalanceFailedTests.test_add_back_failed_node,replica=2,num-buckets=3,num-swap=2,keys-count=1000000

      [2012-11-01 18:04:13,120] - [rest_client:881] INFO - rebalance params : password=password&ejectedNodes=&user=Administrator&knownNodes=ns_1%4010.3.121.94%2Cns_1%4010.3.121.92%2Cns_1%4010.3.121.98%2Cns_1%4010.3.121.96%2Cns_1%4010.3.121.93%2Cns_1%4010.3.121.97%2Cns_1%4010.3.121.95
      [2012-11-01 18:04:13,263] - [rest_client:888] INFO - rebalance operation started

      [2012-11-01 19:06:18,176] - [rest_client:984] INFO - rebalance percentage : 87.1795194549 %
      [2012-11-01 19:06:21,565] - [rest_client:969] ERROR -

      {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'}

      - rebalance failed
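
      For reference, the swap rebalance in this test is driven entirely over the REST API: the "rebalance params" line above is the form body POSTed to /controller/rebalance, and the "rebalance percentage" / errorMessage output comes from polling /pools/default/rebalanceProgress. A minimal sketch of those two calls, using the node list and credentials from this run; the helper functions themselves are illustrative, not the actual rest_client code:

      # Hedged sketch: start a rebalance and poll it roughly the way testrunner's
      # rest_client does. Host, credentials and node list are taken from the log
      # above; the helpers are illustrative, not the real rest_client implementation.
      import base64
      import json
      import time
      import urllib.parse
      import urllib.request

      def _request(url, data=None, user="Administrator", password="password"):
          req = urllib.request.Request(url, data=data)
          auth = base64.b64encode(f"{user}:{password}".encode()).decode()
          req.add_header("Authorization", f"Basic {auth}")
          return urllib.request.urlopen(req)

      def start_rebalance(host, known_nodes, ejected_nodes=()):
          # Same form fields as the "rebalance params" log line.
          body = urllib.parse.urlencode({
              "user": "Administrator",
              "password": "password",
              "knownNodes": ",".join(known_nodes),
              "ejectedNodes": ",".join(ejected_nodes),
          }).encode()
          _request(f"http://{host}:8091/controller/rebalance", data=body)

      def wait_for_rebalance(host, timeout_s=7200):
          # "running" carries per-node progress; "none" means finished, and
          # errorMessage is set when the rebalance failed (as in the log above).
          deadline = time.time() + timeout_s
          while time.time() < deadline:
              with _request(f"http://{host}:8091/pools/default/rebalanceProgress") as resp:
                  status = json.load(resp)
              if status.get("status") == "none":
                  return status
              time.sleep(5)
          raise TimeoutError("rebalance did not finish within timeout")

      nodes = ["ns_1@10.3.121.%d" % n for n in (92, 93, 94, 95, 96, 97, 98)]
      start_rebalance("10.3.121.92", nodes)
      print(wait_for_rebalance("10.3.121.92"))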

      [rebalance:info,2012-11-01T19:08:19.202,ns_1@10.3.121.92:<0.22780.38>:ebucketmigrator_srv:do_confirm_sent_messages:671]Got close ack!

      [ns_server:debug,2012-11-01T19:08:19.228,ns_1@10.3.121.92:<0.347.0>:ns_process_registry:handle_info:98]Got exit msg: {'EXIT',<0.23388.38>,
          {timeout,{gen_server,call,[<0.22819.38>,silence_upstream]}}}
      [ns_server:error,2012-11-01T19:08:19.228,ns_1@10.3.121.92:tap_replication_manager-bucket-0<0.29724.4>:misc:executing_on_new_process:1389]Got unexpected reason from <0.23388.38>: {timeout,
          {gen_server,call,[<0.22819.38>,silence_upstream]}}
      [ns_server:info,2012-11-01T19:08:19.242,ns_1@10.3.121.92:janitor_agent-bucket-0<0.23817.38>:janitor_agent:read_flush_counter:764]Loading flushseq failed: {error,enoent}. Assuming it's equal to global config.
      [rebalance:debug,2012-11-01T19:08:19.242,ns_1@10.3.121.92:<0.13346.37>:janitor_agent:bulk_set_vbucket_state:398]bulk vbucket state change failed for:
      [{'ns_1@10.3.121.92',
        {'EXIT',
         {{{{unexpected_reason,
             {timeout,{gen_server,call,[<0.22819.38>,silence_upstream]}}},
            [{misc,executing_on_new_process,1},
             {tap_replication_manager,change_vbucket_filter,4},
             {tap_replication_manager,'-do_set_incoming_replication_map/3-lc$^5/1-5-',2},
             {tap_replication_manager,do_set_incoming_replication_map,3},
             {tap_replication_manager,handle_call,3},
             {gen_server,handle_msg,5},
             {proc_lib,init_p_do_apply,3}]},
           {gen_server,call,
            ['tap_replication_manager-bucket-0',
             {change_vbucket_replication,575,undefined},
             infinity]}},
          {gen_server,call,
           [{'janitor_agent-bucket-0','ns_1@10.3.121.92'},
            {if_rebalance,<0.13346.37>,
             {update_vbucket_state,575,replica,undefined,undefined}},
            infinity]}}}}]
      [ns_server:info,2012-11-01T19:08:19.246,ns_1@10.3.121.92:janitor_agent-bucket-0<0.23817.38>:janitor_agent:read_flush_counter_from_config:771]Initialized flushseq 0 from bucket config
      [user:info,2012-11-01T19:08:19.253,ns_1@10.3.121.92:<0.4003.0>:ns_orchestrator:handle_info:319]Rebalance exited with reason {{bulk_set_vbucket_state_failed,
        [{'ns_1@10.3.121.92',
          {'EXIT',
           {{{{unexpected_reason,
               {timeout,{gen_server,call,[<0.22819.38>,silence_upstream]}}},
              [{misc,executing_on_new_process,1},
               {tap_replication_manager,change_vbucket_filter,4},
               {tap_replication_manager,'-do_set_incoming_replication_map/3-lc$^5/1-5-',2},
               {tap_replication_manager,do_set_incoming_replication_map,3},
               {tap_replication_manager,handle_call,3},
               {gen_server,handle_msg,5},
               {proc_lib,init_p_do_apply,3}]},
             {gen_server,call,
              ['tap_replication_manager-bucket-0',
               {change_vbucket_replication,575,undefined},
               infinity]}},
            {gen_server,call,
             [{'janitor_agent-bucket-0','ns_1@10.3.121.92'},
              {if_rebalance,<0.13346.37>,
               {update_vbucket_state,575,replica,undefined,undefined}},
              infinity]}}}}]},
        [{janitor_agent,bulk_set_vbucket_state,4},
         {ns_vbucket_mover,update_replication_post_move,3},
         {ns_vbucket_mover,handle_info,2},
         {gen_server,handle_msg,5},
         {proc_lib,init_p_do_apply,3}]}

      [ns_server:debug,2012-11-01T19:08:19.253,ns_1@10.3.121.92:<0.13367.37>:ns_pubsub:do_subscribe_link:132]Parent process of subscription {ns_node_disco_events,<0.13346.37>} exited with reason {{bulk_set_vbucket_state_failed,
        [{'ns_1@10.3.121.92',
          {'EXIT',
           {{{{unexpected_reason,
               {timeout,{gen_server,call,[<0.22819.38>,silence_upstream]}}},
              [{misc,executing_on_new_process,1},
               {tap_replication_manager,change_vbucket_filter,4},
               {tap_replication_manager,'-do_set_incoming_replication_map/3-lc$^5/1-5-',2},
               {tap_replication_manager,do_set_incoming_replication_map,3},
               {tap_replication_manager,handle_call,3},
               {gen_server,handle_msg,5},
               {proc_lib,init_p_do_apply,3}]},
             {gen_server,call,
              ['tap_replication_manager-bucket-0',
               {change_vbucket_replication,575,undefined},
               infinity]}},
            {gen_server,call,
             [{'janitor_agent-bucket-0','ns_1@10.3.121.92'},
              {if_rebalance,<0.13346.37>,
               {update_vbucket_state,575,replica,undefined,undefined}},
              infinity]}}}}]},
        [{janitor_agent,bulk_set_vbucket_state,4},
         {ns_vbucket_mover,update_replication_post_move,3},
         {ns_vbucket_mover,handle_info,2},
         {gen_server,handle_msg,5},
         {proc_lib,init_p_do_apply,3}]}
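
      All three traces above carry the same root reason: a gen_server call to silence_upstream timed out while tap_replication_manager was changing the vbucket filter for vbucket 575, and that surfaced to the orchestrator as bulk_set_vbucket_state_failed. When triaging similar runs without collect_info, grepping the ns_server debug logs for that signature is usually enough to match failures. A rough sketch; the default log file name is an assumption:

      # Rough triage helper: scan ns_server debug logs for the failure signature above.
      import re
      import sys

      SIGNATURE = re.compile(
          r"silence_upstream"                # the gen_server call that timed out
          r"|bulk_set_vbucket_state_failed"  # the reason the rebalance exited with
          r"|unexpected_reason")             # wrapper from misc:executing_on_new_process

      def scan(path):
          with open(path, errors="replace") as f:
              for lineno, line in enumerate(f, 1):
                  if SIGNATURE.search(line):
                      print(f"{path}:{lineno}: {line.rstrip()}")

      if __name__ == "__main__":
          for log in sys.argv[1:] or ["ns_server.debug.log"]:
              scan(log)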

      Unfortunately I can't provide collect_info; we don't grab it after failures at the moment. Waiting for http://review.couchbase.org/#/c/22079/ to be approved.
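
      Until that change lands, equivalent diagnostics can still be pulled by hand right after a failure by running cbcollect_info on each node. Something along these lines; the ssh user, paths and the node list from this run are assumptions about the environment, not what testrunner currently does:

      # Hypothetical post-failure collection: run cbcollect_info on each node over ssh
      # and copy the resulting zip back. Host list, ssh user and paths are assumptions.
      import subprocess

      NODES = ["10.3.121.92", "10.3.121.93", "10.3.121.94", "10.3.121.95",
               "10.3.121.96", "10.3.121.97", "10.3.121.98"]

      def collect(node, ssh_user="root"):
          remote_zip = f"/tmp/collect_{node}.zip"
          # cbcollect_info writes a zip with logs, stats and config from the node.
          subprocess.run(
              ["ssh", f"{ssh_user}@{node}",
               f"/opt/couchbase/bin/cbcollect_info {remote_zip}"],
              check=True)
          subprocess.run(["scp", f"{ssh_user}@{node}:{remote_zip}", "."], check=True)

      if __name__ == "__main__":
          for node in NODES:
              collect(node)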


          People

            Assignee: Aleksey Kondratenko (alkondratenko)
            Reporter: Andrei Baranouski (andreibaranouski)
