Description
We have a 5-node cluster that was upgraded from 181 to 202 (build 805).
After the upgrade, we attempted to swap out the orchestrator (10.3.121.69) and add in a new 202 node (10.3.3.131).
The rebalance fails, with ns_vbucket_mover000 on the orchestrator reporting:
<0.11037.1> exited with {noproc,
    {gen_server,call,
        [ ,
         {if_rebalance,<0.1.1>,initiate_indexing},
         infinity]}}
On the new 202 node, couchdb went down and an Erlang crash dump was generated (attached).
We saw these errors from mccouch, which may be why vbuckets could not be moved to this node:
Fri May 17 07:27:36.981472 PDT 3: (saslbucket) Trying to connect to mccouch: "127.0.0.1:11213"
Fri May 17 07:27:36.981615 PDT 3: (saslbucket) Connected to mccouch: "127.0.0.1:11213"
Fri May 17 07:27:37.019496 PDT 3: (saslbucket) Connection closed by mccouch
Fri May 17 07:27:37.019527 PDT 3: (saslbucket) Resetting connection to mccouch, lastReceivedCommand = notify_vbucket_update lastSentCommand = notify_vbucket_update currentCommand =unknown
Fri May 17 07:27:37.019595 PDT 3: (saslbucket) Trying to connect to mccouch: "127.0.0.1:11213"
Fri May 17 07:27:37.019730 PDT 3: (saslbucket) Connected to mccouch: "127.0.0.1:11213"
Fri May 17 07:27:37.021763 PDT 3: (saslbucket) Connection closed by mccouch
Fri May 17 07:27:37.021788 PDT 3: (saslbucket) Resetting connection to mccouch, lastReceivedCommand = select_bucket lastSentCommand = notify_vbucket_update currentCommand =unknown
=========================CRASH REPORT=========================
crasher:
initial call: mc_connection:init/1
pid: <0.965.0>
registered_name: []
exception error: no case clause matching
in function mc_connection:do_notify_vbucket_update/3
in call from mc_connection:handle_message/9
in call from mc_connection:read_full_message/2
in call from mc_connection:run_loop/2
ancestors: [mc_conn_sup,mc_sup,ns_server_sup,ns_server_cluster_sup,
<0.59.0>]
messages: []
links: [<0.641.0>,#Port<0.6784>]
dictionary: []
trap_exit: false
status: running
heap_size: 1597
stack_size: 24
reductions: 1094838
neighbours:
[error_logger:error,2013-05-17T7:27:36.622,ns_1@10.3.3.131:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,mc_conn_sup}
Context: child_terminated
Reason: {case_clause,{error,system_limit}}
Offender: [{pid,<0.965.0>},
{name,mc_connection},
{mfargs,{mc_connection,start_link,undefined}},
{restart_type,temporary},
{shutdown,brutal_kill},
{child_type,worker}]
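The supervisor reason {case_clause,{error,system_limit}} suggests mc_connection ran into a VM or OS resource limit (commonly the open-file/port limit, since each open file in the Erlang VM consumes a port) that its notify_vbucket_update handler has no case clause for. As a minimal sketch of that failure class, assuming a POSIX system (the helper name and limit value are illustrative, not taken from the node), this Python snippet lowers the process's open-file limit and triggers the EMFILE error that corresponds, at the OS level, to what the VM surfaces as {error,system_limit}:

```python
import errno
import resource

def exhaust_fds(limit=32):
    """Open files until the process hits RLIMIT_NOFILE and return the
    errno raised. EMFILE ("Too many open files") is the OS-level
    analogue of the {error,system_limit} seen in the crash reports.
    (Hypothetical illustration; the node's real limit is whatever
    ulimit -n / the Erlang VM's port limit allows.)"""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    handles = []
    try:
        while True:
            handles.append(open("/dev/null"))
    except OSError as e:
        return e.errno
    finally:
        for h in handles:
            h.close()
        # restore the original soft limit
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))

print(exhaust_fds() == errno.EMFILE)
```

If the node's open-file limit is too low for the number of vbucket files touched during rebalance, couch_file would similarly fail to reopen files, consistent with the problem_reopening_file crash that follows.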
=========================CRASH REPORT=========================
crasher:
initial call: couch_file:spawn_reader/2
pid: <0.652.0>
registered_name: []
exception exit: {problem_reopening_file,
                 {error,system_limit},
                  ,
                 <0.652.0>,
                 "/opt/couchbase/var/lib/couchbase/data/_replicator.couch.1",
                 10}
in function couch_file:reader_loop/3
ancestors: [<0.650.0>,couch_server,couch_primary_services,
couch_server_sup,cb_couch_sup,ns_server_cluster_sup,
<0.59.0>]
messages: []
links: [<0.650.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 377
stack_size: 24
reductions: 504
neighbours: