Details
Description
Rebalance failed with reason not_all_nodes_are_ready_yet because memcached kept exiting on one node (ns_1@10.131.33.89), where it failed to listen on TCP port 11210.
Test to reproduce:
./testrunner -i vm-list.ini -t swaprebalance.SwapRebalanceBasicTests.do_test,replica=1,num-buckets=2,num-swap=2,swap-orchestrator=True,GROUP=P1
Error received:
[2013-03-18 03:37:42,330] - [rest_client:913] ERROR -
{u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
[2013-03-18 03:37:42,330] - [rest_client:914] INFO - Latest logs from UI:
[2013-03-18 03:37:42,378] - [rest_client:915] ERROR - {u'node': u'ns_1@10.142.174.97', u'code': 2, u'text': u"Rebalance exited with reason \n", u'shortText': u'message', u'module': u'ns_orchestrator', u'tstamp': 1363577863726, u'type': u'info'}
[2013-03-18 03:37:42,379] - [rest_client:915] ERROR -
[2013-03-18 03:37:42,379] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 0, u'text': u"Port server memcached on node 'ns_1@10.131.33.89' exited with status 71. Restarting. Messages: Mon Mar 18 03:37:38.278834 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:38.278834 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:38.278834 Coordinated Universal Time 3: failed to listen on TCP port 11210: No error", u'shortText': u'message', u'module': u'ns_port_server', u'tstamp': 1363577858300, u'type': u'info'}
[2013-03-18 03:37:42,379] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 0, u'text': u"Port server memcached on node 'ns_1@10.131.33.89' exited with status 71. Restarting. Messages: Mon Mar 18 03:37:33.114936 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:33.114936 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:33.114936 Coordinated Universal Time 3: failed to listen on TCP port 11210: No error", u'shortText': u'message', u'module': u'ns_port_server', u'tstamp': 1363577853136, u'type': u'info'}
[2013-03-18 03:37:42,379] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 0, u'text': u"Port server memcached on node 'ns_1@10.131.33.89' exited with status 71. Restarting. Messages: Mon Mar 18 03:37:27.936385 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:27.936385 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:27.937385 Coordinated Universal Time 3: failed to listen on TCP port 11210: No error", u'shortText': u'message', u'module': u'ns_port_server', u'tstamp': 1363577847969, u'type': u'info'}
[2013-03-18 03:37:42,379] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 1, u'text': u"Service memcached exited on node 'ns_1@10.131.33.89' in 0.22s\n", u'shortText': u'port exited too soon after restart', u'module': u'supervisor_cushion', u'tstamp': 1363577842806, u'type': u'warning'}
[2013-03-18 03:37:42,379] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 0, u'text': u"Port server memcached on node 'ns_1@10.131.33.89' exited with status 71. Restarting. Messages: Mon Mar 18 03:37:22.777869 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:22.778869 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:22.778869 Coordinated Universal Time 3: failed to listen on TCP port 11210: No error", u'shortText': u'message', u'module': u'ns_port_server', u'tstamp': 1363577842806, u'type': u'info'}
[2013-03-18 03:37:42,379] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 1, u'text': u"Service memcached exited on node 'ns_1@10.131.33.89' in 0.20s\n", u'shortText': u'port exited too soon after restart', u'module': u'supervisor_cushion', u'tstamp': 1363577837582, u'type': u'warning'}
[2013-03-18 03:37:42,380] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 0, u'text': u"Port server memcached on node 'ns_1@10.131.33.89' exited with status 71. Restarting. Messages: Mon Mar 18 03:37:17.551347 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:17.551347 Coordinated Universal Time 3: bind(): No error\nMon Mar 18 03:37:17.551347 Coordinated Universal Time 3: failed to listen on TCP port 11210: No error", u'shortText': u'message', u'module': u'ns_port_server', u'tstamp': 1363577837582, u'type': u'info'}
[2013-03-18 03:37:42,380] - [rest_client:915] ERROR -
{u'node': u'ns_1@10.131.33.89', u'code': 1, u'text': u"Service memcached exited on node 'ns_1@10.131.33.89' in 0.16s\n", u'shortText': u'port exited too soon after restart', u'module': u'supervisor_cushion', u'tstamp': 1363577832372, u'type': u'warning'}
ERROR
The logs have rolled over and are no longer available for this timestamp. I will try to reproduce the failure and attach logs.
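Since every memcached restart above dies on "failed to listen on TCP port 11210", it may be worth checking on the next repro whether the port can be bound at all on the affected node while memcached is down. The snippet below is only a sketch for that check. It is not part of testrunner, `can_bind` is a hypothetical helper name, and it assumes the failure comes from the port being held by another process, which the messages above do not confirm.

```python
import socket


def can_bind(port, host="0.0.0.0"):
    """Return True if a TCP listening socket can be bound to (host, port).

    Hypothetical diagnostic helper; it does not talk to memcached at all,
    it only tries to claim the port and releases it right away.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return True
    except OSError:
        # bind() or listen() was refused, e.g. the port is already in use.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    port = 11210  # the port memcached failed to listen on in the logs above
    print("port %d bindable: %s" % (port, can_bind(port)))
```

If this reports the port as not bindable while memcached is stopped, something else on the node is holding 11210. If it is bindable, the bind failure has a different cause and the memcached logs will be needed.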