Description
Steps:
1. Set up a cluster of 3 nodes.
2. Turn on the firewall on one node.
3. Wait until that node becomes "unhealthy".
4. Trigger graceful failover of that node.
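Step 3 above can be sketched as a small polling helper. This is only an illustration, not the test framework's actual code: the function name, the `fetch_nodes` callable, and the `otpNode`/`status` keys of each node entry are assumptions about how the cluster node list is obtained and shaped.

```python
import time


def wait_for_node_status(fetch_nodes, otp_node, expected="unhealthy",
                         timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll the node list until `otp_node` reports the `expected` status.

    `fetch_nodes` is a hypothetical callable returning a list of dicts,
    each assumed to carry "otpNode" and "status" keys.
    Returns True if the status was reached before `timeout` seconds passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        for node in fetch_nodes():
            if node.get("otpNode") == otp_node and node.get("status") == expected:
                return True
        if time.monotonic() >= deadline:
            return False
        # Wait before asking the cluster again.
        sleep(interval)
```

In the run below the status was reached on the first check after the firewall rule was added, so a helper like this would return almost at once.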
[2014-03-17 18:36:26,188] - [remote_util:1450] INFO - running command.raw on 10.3.4.145: /sbin/iptables -A INPUT -p tcp -i eth0 --dport 1000:60000 -j REJECT
[2014-03-17 18:36:27,731] - [remote_util:1479] INFO - command executed successfully
[2014-03-17 18:36:27,731] - [remote_util:2263] INFO - enabled firewall on ip:10.3.4.145 port:8091 ssh_username:root
[2014-03-17 18:36:27,731] - [remote_util:1450] INFO - running command.raw on 10.3.4.145: /sbin/iptables --list
[2014-03-17 18:36:29,304] - [remote_util:1479] INFO - command executed successfully
[2014-03-17 18:36:29,304] - [remote_util:1401] INFO - Chain INPUT (policy ACCEPT)
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO - REJECT tcp -- anywhere anywhere tcp dpts:cadlock2:60000 reject-with icmp-port-unreachable
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO -
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO - Chain FORWARD (policy ACCEPT)
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO -
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - Chain OUTPUT (policy ACCEPT)
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO -
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - Chain RH-Firewall-1-INPUT (0 references)
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:32,503] - [rest_client:125] INFO - node ns_1@10.3.4.145 status : unhealthy
[2014-03-17 18:36:32,503] - [rest_client:132] INFO - node ns_1@10.3.4.145 status_reached : True
[2014-03-17 18:36:32,503] - [failovertests:72] INFO - node 10.3.4.145:8091 is 'unhealthy' as expected
[2014-03-17 18:36:33,727] - [rest_client:942] INFO - fail_over node ns_1@10.3.4.145 successful
[2014-03-17 18:36:34,905] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:37,482] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:40,441] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:43,481] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:46,512] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:49,559] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:52,518] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:55,456] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:58,390] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:37:01,530] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:37:04,490] - [rest_client:1059] ERROR -
- rebalance failed
[2014-03-17 18:37:09,870] - [rest_client:1838] INFO - Latest logs from UI:
[2014-03-17 18:37:09,870] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 2, u'text': u"Rebalance exited with reason
\n", u'shortText': u'message', u'serverTime': u'2014-03-17T08:09:14.238Z', u'module': u'ns_orchestrator', u'tstamp': 1395068954238, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR -
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 1, u'text': u'Rebalance completed successfully.\n', u'shortText': u'message', u'serverTime': u'2014-03-17T08:06:08.669Z', u'module': u'ns_orchestrator', u'tstamp': 1395068768669, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 0, u'text': u'Bucket "default" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:05.908Z', u'module': u'ns_vbucket_mover', u'tstamp': 1395068525908, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.145', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@10.3.4.145\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:05.205Z', u'module': u'ns_memcached', u'tstamp': 1395068525205, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.147', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@10.3.4.147\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:05.023Z', u'module': u'ns_memcached', u'tstamp': 1395068525023, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 0, u'text': u'Started rebalancing bucket default', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:04.681Z', u'module': u'ns_rebalancer', u'tstamp': 1395068524681, u'type': u'info'}
[2014-03-17 18:37:09,872] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 0, u'text': u'Bucket "bucket0" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2014-03-17T07:57:48.975Z', u'module': u'ns_vbucket_mover', u'tstamp': 1395068268975, u'type': u'info'}
[2014-03-17 18:37:09,872] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.145', u'code': 0, u'text': u'Bucket "bucket0" loaded on node \'ns_1@10.3.4.145\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T07:57:48.621Z', u'module': u'ns_memcached', u'tstamp': 1395068268621, u'type': u'info'}
[2014-03-17 18:37:09,872] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.147', u'code': 0, u'text': u'Bucket "bucket0" loaded on node \'ns_1@10.3.4.147\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T07:57:48.296Z', u'module': u'ns_memcached', u'tstamp': 1395068268296, u'type': u'info'}
Please note that the node was reported as "unhealthy" almost immediately after the firewall was enabled, and graceful failover was started right after that.
The failover request was accepted and only failed later, during rebalance. It would be better to get a response saying that graceful failover cannot be performed because the node is unreachable.
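The requested behaviour could look roughly like the check below. This is a sketch under stated assumptions, not the server's implementation: `NodeUnreachableError` and `check_graceful_failover_allowed` are hypothetical names, and the node entries are assumed to carry `otpNode` and `status` keys with the "healthy"/"unhealthy" values seen in the log above.

```python
class NodeUnreachableError(Exception):
    """Hypothetical error for a graceful failover request on an unreachable node."""


def check_graceful_failover_allowed(nodes, otp_node):
    """Reject graceful failover up front if the target node is not healthy.

    `nodes` is assumed to be a list of dicts with "otpNode" and "status" keys.
    Raises NodeUnreachableError instead of letting the rebalance start and fail.
    """
    for node in nodes:
        if node.get("otpNode") == otp_node:
            status = node.get("status")
            if status != "healthy":
                raise NodeUnreachableError(
                    "cannot perform graceful failover of %s: node status is %r"
                    % (otp_node, status)
                )
            return
    raise ValueError("unknown node: %s" % otp_node)


if __name__ == "__main__":
    # Node names taken from the log above; statuses reflect the state after step 3.
    cluster = [
        {"otpNode": "ns_1@10.3.4.144", "status": "healthy"},
        {"otpNode": "ns_1@10.3.4.145", "status": "unhealthy"},
        {"otpNode": "ns_1@10.3.4.147", "status": "healthy"},
    ]
    try:
        check_graceful_failover_allowed(cluster, "ns_1@10.3.4.145")
    except NodeUnreachableError as exc:
        print(exc)
```

With a check like this the caller gets an immediate, explicit error for 10.3.4.145 rather than a "successful" failover followed by a failed rebalance.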