Loading...

Details

Type: Bug
Resolution: Duplicate
Priority: Blocker
Fix Version/s: 3.0
Affects Version/s: 2.0, 2.0.1, 2.1.0
Component/s: ns_server
Security Level: Public
Labels:
None

Flagged:

Release Note
Sprint:
12/Aug - 30/Aug

Description

version=2.0.0-1947-rel
http://qa.hq.northscale.net/job/centos-64-2.0-failover-tests/448/consoleFull
/testrunner -i resources/jenkins/centos-64-5node-failover.ini get-logs=True,GROUP=BAT -t failovertests.FailoverTests.test_failover_firewall,replica=1,keys_count=100000,dgm_run=True,GROUP=BAT

steps:
1.3 nodes in cluster:10.1.3.114, 10.1.3.118,10.1.3.116
2.enabled firewall on ip:10.1.3.118 : /sbin/iptables -A INPUT -p tcp -i eth0 --dport 1000:60000 -j REJECT
3
[2012-11-12 11:28:37,102] - [failovertests:260] INFO - node 10.1.3.118:8091 is 'unhealthy' as expected
[2012-11-12 11:29:07,544] - [rest_client:849] INFO - fail_over successful
[2012-11-12 11:29:07,545] - [failovertests:278] INFO - failed over node : ns_1@10.1.3.118
[2012-11-12 11:29:07,545] - [failovertests:292] INFO - 10 seconds sleep after failover before invoking rebalance...
4.[2012-11-12 11:29:17,545] - [rest_client:883] INFO - rebalance params : password=password&ejectedNodes=ns_1%4010.1.3.118&user=Administrator&knownNodes=ns_1%4010.1.3.114%2Cns_1%4010.1.3.118%2Cns_1%4010.1.3.116

result:
[2012-11-12 11:29:17,562] - [rest_client:890] INFO - rebalance operation started
[2012-11-12 11:29:17,569] - [rest_client:986] INFO - rebalance percentage : 0 %
[2012-11-12 11:29:19,573] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:21,577] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:23,584] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:25,589] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:27,593] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:29,599] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:31,603] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:33,607] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:35,612] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:37,616] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:39,621] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:41,626] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:43,630] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:45,635] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:47,640] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:49,644] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:51,648] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:53,652] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:55,657] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:57,662] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:29:59,667] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:01,671] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:03,676] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:05,682] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:07,698] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:09,703] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:11,708] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:13,712] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:15,716] - [rest_client:986] INFO - rebalance percentage : 0.0 %
[2012-11-12 11:30:17,721] - [rest_client:971] ERROR -

{u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'}

- rebalance failed

[rebalance:info,2012-11-12T11:49:03.581,ns_1@10.1.3.114:<0.20240.1>:ns_rebalancer:rebalance:258]Waiting for bucket "default" to be ready on ['ns_1@10.1.3.114',
'ns_1@10.1.3.116']
[ns_server:info,2012-11-12T11:49:09.386,ns_1@10.1.3.114:<0.769.0>:ns_orchestrator:handle_info:282]Skipping janitor in state rebalancing: {rebalancing_state,<0.20215.1>,
{dict,3,16,16,8,80,48,

{[],[],[],[],[],[],[],[],[],[],[],[], [],[],[],[]},
{{[],[],[],[],
[['ns_1@10.1.3.114'|0.0]],
[],
[['ns_1@10.1.3.116'|0.0]],
[],
[['ns_1@10.1.3.118'|0.0]],
[],[],[],[],[],[],[]}}},
['ns_1@10.1.3.114','ns_1@10.1.3.116'],
[],
['ns_1@10.1.3.118']}
[ns_server:info,2012-11-12T11:49:19.386,ns_1@10.1.3.114:<0.769.0>:ns_orchestrator:handle_info:282]Skipping janitor in state rebalancing: {rebalancing_state,<0.20215.1>,
{dict,3,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[], [],[],[],[]}

,
{{[],[],[],[],
[['ns_1@10.1.3.114'|0.0]],
[],
[['ns_1@10.1.3.116'|0.0]],
[],
[['ns_1@10.1.3.118'|0.0]],
[],[],[],[],[],[],[]}}},
['ns_1@10.1.3.114','ns_1@10.1.3.116'],
[],
['ns_1@10.1.3.118']}
[ns_server:info,2012-11-12T11:49:23.120,ns_1@10.1.3.114:<0.20319.1>:compaction_daemon:try_to_cleanup_indexes:439]Cleaning up indexes for bucket `default`
[ns_server:info,2012-11-12T11:49:23.121,ns_1@10.1.3.114:<0.20319.1>:compaction_daemon:spawn_bucket_compactor:404]Compacting bucket default with config:
[{database_fragmentation_threshold,{30,undefined}},
{view_fragmentation_threshold,{30,undefined}}]
[ns_server:info,2012-11-12T11:49:24.348,ns_1@10.1.3.114:ns_config_rep<0.358.0>:ns_config_rep:do_pull:341]Pulling config from: 'ns_1@10.1.3.116'

[ns_server:info,2012-11-12T11:49:29.386,ns_1@10.1.3.114:<0.769.0>:ns_orchestrator:handle_info:282]Skipping janitor in state rebalancing: {rebalancing_state,<0.20215.1>,
{dict,3,16,16,8,80,48,

{[],[],[],[],[],[],[],[],[],[],[],[], [],[],[],[]},
{{[],[],[],[],
[['ns_1@10.1.3.114'|0.0]],
[],
[['ns_1@10.1.3.116'|0.0]],
[],
[['ns_1@10.1.3.118'|0.0]],
[],[],[],[],[],[],[]}}},
['ns_1@10.1.3.114','ns_1@10.1.3.116'],
[],
['ns_1@10.1.3.118']}
[ns_server:warn,2012-11-12T11:49:29.546,ns_1@10.1.3.114:capi_set_view_manager-default<0.15508.1>:capi_set_view_manager:handle_info:345]Remote server node {'capi_ddoc_replication_srv-default','ns_1@10.1.3.118'} process down: noconnection
[ns_server:warn,2012-11-12T11:49:29.546,ns_1@10.1.3.114:xdc_rdoc_replication_srv<0.470.0>:xdc_rdoc_replication_srv:handle_info:128]Remote server node {xdc_rdoc_replication_srv,'ns_1@10.1.3.118'} process down: noconnection
[user:warn,2012-11-12T11:49:29.546,ns_1@10.1.3.114:ns_node_disco<0.351.0>:ns_node_disco:handle_info:168]Node 'ns_1@10.1.3.114' saw that node 'ns_1@10.1.3.118' went down.
[error_logger:error,2012-11-12T11:49:29.546,ns_1@10.1.3.114:error_logger<0.5.0>:ale_error_logger_handler:log_msg:76]** Node 'ns_1@10.1.3.118' not responding **
** Removing (timedout) connection **

[ns_server:info,2012-11-12T11:49:30.536,ns_1@10.1.3.114:ns_config_rep<0.358.0>:ns_config_rep:do_pull:341]Pulling config from: 'ns_1@10.1.3.116'

[ns_server:info,2012-11-12T11:49:39.386,ns_1@10.1.3.114:<0.769.0>:ns_orchestrator:handle_info:282]Skipping janitor in state rebalancing: {rebalancing_state,<0.20215.1>,
{dict,3,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[], [],[],[],[]}

,
{{[],[],[],[],
[['ns_1@10.1.3.114'|0.0]],
[],
[['ns_1@10.1.3.116'|0.0]],
[],
[['ns_1@10.1.3.118'|0.0]],
[],[],[],[],[],[],[]}}},
['ns_1@10.1.3.114','ns_1@10.1.3.116'],
[],
['ns_1@10.1.3.118']}
[ns_server:info,2012-11-12T11:49:49.386,ns_1@10.1.3.114:<0.769.0>:ns_orchestrator:handle_info:282]Skipping janitor in state rebalancing: {rebalancing_state,<0.20215.1>,
{dict,3,16,16,8,80,48,

{[],[],[],[],[],[],[],[],[],[],[],[], [],[],[],[]},
{{[],[],[],[],
[['ns_1@10.1.3.114'|0.0]],
[],
[['ns_1@10.1.3.116'|0.0]],
[],
[['ns_1@10.1.3.118'|0.0]],
[],[],[],[],[],[],[]}}},
['ns_1@10.1.3.114','ns_1@10.1.3.116'],
[],
['ns_1@10.1.3.118']}
[ns_server:info,2012-11-12T11:49:53.135,ns_1@10.1.3.114:<0.20445.1>:compaction_daemon:try_to_cleanup_indexes:439]Cleaning up indexes for bucket `default`
[ns_server:info,2012-11-12T11:49:53.136,ns_1@10.1.3.114:<0.20445.1>:compaction_daemon:spawn_bucket_compactor:404]Compacting bucket default with config:
[{database_fragmentation_threshold,{30,undefined}},
{view_fragmentation_threshold,{30,undefined}}]
[ns_server:info,2012-11-12T11:49:59.386,ns_1@10.1.3.114:<0.769.0>:ns_orchestrator:handle_info:282]Skipping janitor in state rebalancing: {rebalancing_state,<0.20215.1>,
{dict,3,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[], [],[],[],[]}

,
{{[],[],[],[],
[['ns_1@10.1.3.114'|0.0]],
[],
[['ns_1@10.1.3.116'|0.0]],
[],
[['ns_1@10.1.3.118'|0.0]],
[],[],[],[],[],[],[]}}},
['ns_1@10.1.3.114','ns_1@10.1.3.116'],
[],
['ns_1@10.1.3.118']}
[user:info,2012-11-12T11:50:03.582,ns_1@10.1.3.114:<0.769.0>:ns_orchestrator:handle_info:319]Rebalance exited with reason

{not_all_nodes_are_ready_yet, ['ns_1@10.1.3.114','ns_1@10.1.3.116']}

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

01774fe6-193d-4949-a7f5-eec408f72856-10.1.3.114-diag.txt.gz
21/Nov/12 5:30 AM
781 kB
Andrei Baranouski
01774fe6-193d-4949-a7f5-eec408f72856-10.1.3.115-diag.txt.gz
21/Nov/12 5:30 AM
321 kB
Andrei Baranouski
01774fe6-193d-4949-a7f5-eec408f72856-10.1.3.116-diag.txt.gz
21/Nov/12 5:30 AM
791 kB
Andrei Baranouski
01774fe6-193d-4949-a7f5-eec408f72856-10.1.3.117-diag.txt.gz
21/Nov/12 5:30 AM
304 kB
Andrei Baranouski
01774fe6-193d-4949-a7f5-eec408f72856-10.1.3.118-diag.txt.gz
21/Nov/12 5:30 AM
811 kB
Andrei Baranouski
5286537d-1565-4e27-a147-8c8b8f064759-10.1.3.114-diag.txt.gz
13/Nov/12 3:16 AM
4.73 MB
Andrei Baranouski
5286537d-1565-4e27-a147-8c8b8f064759-10.1.3.116-diag.txt.gz
13/Nov/12 3:16 AM
3.14 MB
Andrei Baranouski
5286537d-1565-4e27-a147-8c8b8f064759-10.1.3.118-diag.txt.gz
13/Nov/12 3:16 AM
2.68 MB
Andrei Baranouski

Gerrit Reviews

- Issue Only
- Show All Reviews
- Show Open Reviews
- Show All Issues
- Show Open Issues

No reviews matched the request. Check your Options in the drop-down menu of this sections header.

[Doc'd 2.2.0] failover of node that's completely down is still not quick (was: Rebalance exited with reason {not_all_nodes_are_ready_yet after failover node)

Details

Description

Attachments

Attachments

Gerrit Reviews

Activity

People

Dates

Gerrit Reviews

PagerDuty