Description
build-705
Steps:
1. Set up a 3-node cluster (10.3.121.112, 10.3.121.113, 10.3.121.114) with one sasl bucket and 10M items.
2. Reboot all nodes at the same time.
Result:
10.3.121.112 and 10.3.121.113 are in the pending state; 10.3.121.114 is down with the following error in the logs:
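The reboot step can be scripted so the nodes go down together rather than one by one. A minimal sketch, assuming passwordless SSH as root to the three nodes (the IPs are the ones from this report; everything else is an assumption for illustration):

```python
# Sketch of step 2: reboot all cluster nodes at (nearly) the same time.
# Assumes passwordless SSH as root; node IPs are from this report.
import subprocess
from concurrent.futures import ThreadPoolExecutor

NODES = ["10.3.121.112", "10.3.121.113", "10.3.121.114"]

def reboot_cmd(ip):
    # BatchMode=yes makes ssh fail fast instead of prompting for a password
    return ["ssh", "-o", "BatchMode=yes", f"root@{ip}", "reboot"]

def reboot_all(nodes=NODES):
    # Fire the reboots concurrently so all nodes restart together
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        list(pool.map(lambda ip: subprocess.run(reboot_cmd(ip)), nodes))
```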
[error_logger:error,2012-08-19T20:31:39.066,ns_1@10.3.121.114:error_logger:ale_error_logger_handler:log_report:72]
=========================SUPERVISOR REPORT=========================
Supervisor:
Context: child_terminated
Reason: {noproc,
{gen_server,call,
[
,
]}}
Offender: [
,
,
{mfargs,{menelaus_web_alerts_srv,start_link,[]}},
{shutdown,5000},
{child_type,worker}]
[error_logger:error,2012-08-19T20:40:14.856,ns_1@10.3.121.114:error_logger:ale_error_logger_handler:log_msg:76]** Node 'ns_1@10.3.121.112' not responding **
** Removing (timedout) connection **
[ns_server:error,2012-08-19T20:40:56.438,ns_1@10.3.121.114:ns_doctor:ns_doctor:update_status:203]The following buckets became not ready on node 'ns_1@10.3.121.112': ["sasl"], those of them are active []
[error_logger:error,2012-08-19T20:42:34.008,ns_1@10.3.121.114:error_logger:ale_error_logger_handler:log_report:72]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,'ns_vbm_new_sup-sasl'}
Context: child_terminated
Reason: normal
Offender: [{pid,<0.7117.0>},
{name,
{new_child_id,
[171,172,173,174,175,176,177,178,179,180,181,182,
183,184,185,186,187,188,189,190,191,192,193,194,
195,196,197,198,199,200,201,202,203,204,205,206,
207,208,209,210,211,212,213,214,215,216,217,218,
219,220,221,222,223,224,225,226,227,228,229,230,
231,232,233,234,235,236,237,238,239,240,241,242,
243,244,245,246,247,248,249,250,251,252,253,254,
255,256,257,258,259,260,261,262,263,264,265,266,
267,268,269,270,271,272,273,274,275,276,277,278,
279,280,281,282,283,284,285,286,287,288,289,290,
291,292,293,294,295,296,297,298,299,300,301,302,
303,304,305,306,307,308,309,310,311,312,313,314,
315,316,317,318,319,320,321,322,323,324,325,326,
327,328,329,330,331,332,333,334,335,336,337,338,
339,340,341],
'ns_1@10.3.121.112'}},
{mfargs,
{ebucketmigrator_srv,start_link,
[{"10.3.121.112",11209},
{"10.3.121.114",11209},
[{username,"sasl"},
{password,"sasl"},
{vbuckets, [171,172,173,174,175,176,177,178,179,180,181, 182,183,184,185,186,187,188,189,190,191,192, 193,194,195,196,197,198,199,200,201,202,203, 204,205,206,207,208,209,210,211,212,213,214, 215,216,217,218,219,220,221,222,223,224,225, 226,227,228,229,230,231,232,233,234,235,236, 237,238,239,240,241,242,243,244,245,246,247, 248,249,250,251,252,253,254,255,256,257,258, 259,260,261,262,263,264,265,266,267,268,269, 270,271,272,273,274,275,276,277,278,279,280, 281,282,283,284,285,286,287,288,289,290,291, 292,293,294,295,296,297,298,299,300,301,302, 303,304,305,306,307,308,309,310,311,312,313, 314,315,316,317,318,319,320,321,322,323,324, 325,326,327,328,329,330,331,332,333,334,335, 336,337,338,339,340,341]},
{takeover,false},
{suffix,"ns_1@10.3.121.114"}]]}},
{restart_type,permanent}
,
,
]
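For reference, the vbucket list in the supervisor report above is simply the contiguous range 171 through 341, i.e. 171 vbuckets being streamed from ns_1@10.3.121.112 to ns_1@10.3.121.114. A quick Python check:

```python
# The vbucket ids in the ebucketmigrator_srv args above form one
# contiguous range (Python's range end is exclusive, hence 342).
vbuckets = list(range(171, 342))

assert vbuckets[0] == 171 and vbuckets[-1] == 341
print(len(vbuckets))  # 171 vbuckets in this replication stream
```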
So, 10.3.121.114 could not find the orchestrator after restarting and did not come up.
Error from the orchestrator, which hangs in the pending state:
[ns_server:warn,2012-08-19T20:54:16.524,ns_1@10.3.121.112:'capi_ddoc_replication_srv-sasl':cb_generic_replication_srv:handle_info:140]Remote server node
{'capi_ddoc_replication_srv-sasl','ns_1@10.3.121.114'} process down: noconnection
[error_logger:error,2012-08-19T20:54:16.525,ns_1@10.3.121.112:error_logger:ale_error_logger_handler:log_report:72]
=========================CRASH REPORT=========================
crasher:
initial call: ns_memcached:init/1
pid: <0.655.0>
registered_name: []
exception exit: {{badmatch,{error,timeout}},
[
,
,
,
,
,
]}
in function gen_server:init_it/6
ancestors: ['ns_memcached_sup-sasl','single_bucket_sup-sasl',<0.552.0>]
messages: [check_started,check_started,check_started,check_started,
check_started,check_started,check_started,
{'$gen_call',
,connected},
check_started,check_started,
{'$gen_call',
,topkeys},
check_started,check_started,check_started,check_started,
check_started,check_started,check_started,check_started,
{'$gen_call',
,connected},
check_started,check_started,check_started,check_started,
check_started,check_started,check_started,check_started,
check_started,check_started,
{'$gen_call',
,connected},
check_started,check_started,check_started,check_started,
check_started,check_started,check_started,check_started,
check_started,check_started,
{'$gen_call',
,connected},
check_started,check_started,check_started,check_started,
check_started,check_started,check_started,check_started,
check_started,check_started,
{'$gen_call',
,connected},
check_started,check_started,check_started,check_started,
check_started,check_started,check_started,check_started,
check_started,check_started,
{'$gen_call',
,connected},
check_started,check_started,check_started]
links: [<0.60.0>,<0.648.0>,#Port<0.7311>]
dictionary: []
trap_exit: true
status: running
heap_size: 75025
stack_size: 24
reductions: 6393
neighbours:
Attachments
For Gerrit Dashboard: MB-6315

| # | Subject | Branch | Project | Status | CR | V |
|---|---|---|---|---|---|---|
| 21331,1 | MB-6315: Redirect stderr and stdout to a file | master | ns_server | ABANDONED | -1 | 0 |
| 21498,2 | MB-6315: redirect stdout and stderr of init script to log file | couchbase | voltron | MERGED | +2 | +1 |