Details
- Bug
- Resolution: Fixed
- Blocker
- 1.6.4
- Security Level: Public
- None
- Physical RHEL Linux boxes
Description
I performed the following steps:
- Set up a two-node cluster (6 GB RAM on each node) with one replica
- Load 10 million items by running memcachetest and mc_loader
- Add another node and rebalance
- Validate the items loaded by mc_loader
The rebalance succeeded, but some time later the vbucketmigrator processes on all three nodes were killed and restarted repeatedly. Data validation also failed: 3 million items were missing from the system.
Output of the vbucketctl command, showing how many vbuckets were active on each host:
chiyoung:management chiyoung$ python ./vbucketctl 10.2.1.51:11210 list | grep active | wc -l
0
chiyoung:management chiyoung$ python ./vbucketctl 10.2.1.53:11210 list | grep active | wc -l
342
chiyoung:management chiyoung$ python ./vbucketctl 10.2.1.54:11210 list | grep active | wc -l
0
Only 342 vbuckets were active, all of them on one node (10.2.1.53); the remaining vbuckets were all in the dead state.
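The per-host `grep active | wc -l` check above can be generalized to tally every state at once. The following is a minimal sketch, not part of the original report. It assumes that each line of `vbucketctl ... list` output contains the state name as a separate word (the exact output format is not shown in this report), and the total of 1024 vbuckets per bucket is an assumption used only for illustration.

```python
from collections import Counter

# State names assumed to appear as whole words in each output line.
STATES = ("active", "replica", "pending", "dead")


def count_states(listing):
    """Tally the lines of a vbucket listing by the state word they contain."""
    counts = Counter()
    for line in listing.splitlines():
        words = line.split()
        for state in STATES:
            if state in words:
                counts[state] += 1
                break  # count each line at most once
    return counts


def inactive_vbuckets(active_per_node, total_vbuckets):
    """Return how many vbuckets are not active on any node."""
    return total_vbuckets - sum(active_per_node.values())


if __name__ == "__main__":
    # Active counts taken from the vbucketctl output in this report.
    active = {"10.2.1.51": 0, "10.2.1.53": 342, "10.2.1.54": 0}
    # ASSUMPTION: 1024 vbuckets per bucket; adjust to the real configuration.
    print(inactive_vbuckets(active, total_vbuckets=1024))
```

Unlike `grep active`, which matches substrings, this sketch matches whole words, so a line is never counted under the wrong state because of a partial match.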
The following is the log snippet from the web UI console:
Bucket "default" loaded on node 'ns_1@10.2.1.51' in 354 seconds. ns_memcached001 13:08:51 - Thu Nov 18, 2010
Control connection to memcached on 'ns_1@10.2.1.51' disconnected: {{badmatch,
{error,
closed}},
[
Port server memcached on node 'ns_1@10.2.1.51' exited with status 136. Restarting. Messages: Backfilling token for eq_tapq:anon_1 went invalid. Stopping backfill.
Backfilling token for eq_tapq:anon_1 went invalid. Stopping backfill.
Backfilling token for eq_tapq:anon_1 went invalid. Stopping backfill. ns_port_server000 13:02:56 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 70. Restarting. Messages: Connecting to {Sock 10.2.1.53:11210}
Authenticating towards: {Sock 10.2.1.53:11210}
Authenticated towards: {Sock 10.2.1.53:11210} (repeated 19 times) ns_port_server000 13:01:28 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 78. Restarting. Messages: Connecting to {Sock 10.2.1.54:11210}
Failed to connect to host: Failed to connect to [10.2.1.54:11210] (repeated 12 times) ns_port_server000 13:00:52 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 70. Restarting. Messages: Authenticating towards: {Sock 10.2.1.51:11210}
Authenticated towards: {Sock 10.2.1.51:11210}
Downstream connection closed.. shutdown upstream (repeated 6 times) ns_port_server000 13:00:52 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 70. Restarting. Messages: Connecting to {Sock 10.2.1.53:11210}
Authenticating towards: {Sock 10.2.1.53:11210}
Authenticated towards: {Sock 10.2.1.53:11210} ns_port_server000 12:55:39 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 74. Restarting. Messages: An error occured on the downstream connection..
Downstream connection closed.. shutdown upstream
Had 360 pending messages at exit. ns_port_server000 12:55:34 - Thu Nov 18, 2010
Node 'ns_1@10.2.1.53' saw that node 'ns_1@10.2.1.54' came up. ns_node_disco004 12:55:33 - Thu Nov 18, 2010
Node 'ns_1@10.2.1.54' saw that node 'ns_1@10.2.1.53' came up. ns_node_disco004 12:55:32 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 70. Restarting. Messages: Authenticating towards: {Sock 10.2.1.51:11210}
Authenticated towards: {Sock 10.2.1.51:11210}
Downstream connection closed.. shutdown upstream ns_port_server000 12:55:31 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 78. Restarting. Messages: Connecting to {Sock 10.2.1.54:11210}
Failed to connect to host: Failed to connect to [10.2.1.54:11210] ns_port_server000 12:55:31 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 74. Restarting. Messages: An error occured on the downstream connection..
Downstream connection closed.. shutdown upstream
Had 1270 pending messages at exit. ns_port_server000 12:55:31 - Thu Nov 18, 2010
Port server memcached on node 'ns_1@10.2.1.54' exited with status 137. Restarting. Messages: sqlite error: SQL logic error or missing database
sqlite error: SQL logic error or missing database
sqlite error: SQL logic error or missing database ns_port_server000 12:55:31 - Thu Nov 18, 2010
Control connection to memcached on 'ns_1@10.2.1.54' disconnected: {{badmatch, {error, closed}}, [{mc_client_binary, stats_recv, 4}, {mc_client_binary, stats, 4}, {ns_memcached, handle_call, 3}, {gen_server, handle_msg, 5}, {proc_lib, init_p_do_apply, 3}]} ns_memcached004 12:55:31 - Thu Nov 18, 2010
Node 'ns_1@10.2.1.54' saw that node 'ns_1@10.2.1.51' came up. ns_node_disco004 12:55:29 - Thu Nov 18, 2010
Node 'ns_1@10.2.1.51' saw that node 'ns_1@10.2.1.54' came up. ns_node_disco004 12:55:29 - Thu Nov 18, 2010
Node 'ns_1@10.2.1.51' saw that node 'ns_1@10.2.1.54' went down. ns_node_disco005 12:49:15 - Thu Nov 18, 2010
Node 'ns_1@10.2.1.53' saw that node 'ns_1@10.2.1.54' went down. ns_node_disco005 12:49:14 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.51' in 393 seconds. ns_memcached001 12:27:27 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.54' exited with status 70. Restarting. Messages: Connecting to {Sock 10.2.1.54:11210}
Authenticating towards: {Sock 10.2.1.54:11210}
Authenticated towards: {Sock 10.2.1.54:11210} (repeated 1 times) ns_port_server000 12:26:43 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.54' exited with status 70. Restarting. Messages: Authenticating towards: {Sock 10.2.1.54:11210}
Authenticated towards: {Sock 10.2.1.54:11210}
Downstream connection closed.. shutdown upstream (repeated 14 times) ns_port_server000 12:26:43 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 78. Restarting. Messages: Connecting to {Sock 10.2.1.51:11210}
Failed to connect to host: Failed to connect to [10.2.1.51:11210] (repeated 15 times) ns_port_server000 12:26:28 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 70. Restarting. Messages: Connecting to {Sock 10.2.1.53:11210}
Authenticating towards: {Sock 10.2.1.53:11210}
Authenticated towards: {Sock 10.2.1.53:11210} (repeated 2 times) ns_port_server000 12:26:28 - Thu Nov 18, 2010
Membase Server has started on web port 8091 on node 'ns_1@10.2.1.51'. menelaus_app001 12:20:53 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 74. Restarting. Messages: An error occured on the downstream connection..
Downstream connection closed.. shutdown upstream
Had 173 pending messages at exit. ns_port_server000 12:20:52 - Thu Nov 18, 2010
Control connection to memcached on 'ns_1@10.2.1.53' disconnected: {{badmatch, {error, timeout}}, [{mc_client_binary, cmd_binary_vocal_recv, 5}, {mc_client_binary, delete_vbucket, 2}, {ns_memcached, handle_call, 3}, {gen_server, handle_msg, 5}, {proc_lib, init_p_do_apply, 3}]} (repeated 8 times) ns_memcached004 12:11:28 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.53' in 0 seconds. (repeated 8 times) ns_memcached001 12:11:28 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.53' in 0 seconds. ns_memcached001 12:05:29 - Thu Nov 18, 2010
Control connection to memcached on 'ns_1@10.2.1.53' disconnected: {{badmatch, {error, timeout}}, [{mc_client_binary, cmd_binary_vocal_recv, 5}, {mc_client_binary, delete_vbucket, 2}, {ns_memcached, handle_call, 3}, {gen_server, handle_msg, 5}, {proc_lib, init_p_do_apply, 3}]} ns_memcached004 12:05:29 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.54' in 2698 seconds. ns_memcached001 11:21:45 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.53' in 657 seconds. ns_memcached001 10:59:50 - Thu Nov 18, 2010
Control connection to memcached on 'ns_1@10.2.1.53' disconnected: {{badmatch, {error, closed}}, [{mc_client_binary, stats_recv, 4}, {mc_client_binary, stats, 4}, {ns_memcached, handle_call, 3}, {gen_server, handle_msg, 5}, {proc_lib, init_p_do_apply, 3}]} ns_memcached004 10:48:53 - Thu Nov 18, 2010
Port server memcached on node 'ns_1@10.2.1.53' exited with status 136. Restarting. Messages: Backfilling token for eq_tapq:anon_21 went invalid. Stopping backfill.
Backfilling token for eq_tapq:anon_21 went invalid. Stopping backfill.
Backfilling token for eq_tapq:anon_21 went invalid. Stopping backfill. ns_port_server000 10:48:53 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.51' in 167 seconds. ns_memcached001 10:45:10 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 70. Restarting. Messages: Authenticating towards: {Sock 10.2.1.53:11210}
Authenticated towards: {Sock 10.2.1.53:11210}
Downstream connection closed.. shutdown upstream (repeated 18 times) ns_port_server000 10:42:28 - Thu Nov 18, 2010
Port server memcached on node 'ns_1@10.2.1.51' exited with status 136. Restarting. Messages: Backfilling token for eq_tapq:anon_2 went invalid. Stopping backfill.
Backfilling token for eq_tapq:anon_2 went invalid. Stopping backfill.
Backfilling token for eq_tapq:anon_2 went invalid. Stopping backfill. ns_port_server000 10:42:21 - Thu Nov 18, 2010
Control connection to memcached on 'ns_1@10.2.1.51' disconnected: {{badmatch, {error, closed}}, [{mc_client_binary, stats_recv, 4}, {mc_client_binary, stats, 4}, {ns_memcached, handle_call, 3}, {gen_server, handle_msg, 5}, {proc_lib, init_p_do_apply, 3}]} ns_memcached004 10:42:21 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 78. Restarting. Messages: Connecting to {Sock 10.2.1.54:11210}
Failed to connect to host: Failed to connect to [10.2.1.54:11210] (repeated 9 times) ns_port_server000 10:41:52 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 70. Restarting. Messages: Authenticating towards: {Sock 10.2.1.51:11210}
Authenticated towards: {Sock 10.2.1.51:11210}
Downstream connection closed.. shutdown upstream (repeated 9 times) ns_port_server000 10:41:52 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 70. Restarting. Messages: Connecting to {Sock 10.2.1.53:11210}
Authenticating towards: {Sock 10.2.1.53:11210}
Authenticated towards: {Sock 10.2.1.53:11210} ns_port_server000 10:36:47 - Thu Nov 18, 2010
Control connection to memcached on 'ns_1@10.2.1.54' disconnected: {{badmatch, {error, closed}}, [{mc_client_binary, stats_recv, 4}, {mc_client_binary, stats, 4}, {ns_memcached, handle_call, 3}, {gen_server, handle_msg, 5}, {proc_lib, init_p_do_apply, 3}]} ns_memcached004 10:36:47 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 70. Restarting. Messages: Authenticating towards: {Sock 10.2.1.53:11210}
Authenticated towards: {Sock 10.2.1.53:11210}
Downstream connection closed.. shutdown upstream ns_port_server000 10:36:47 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 74. Restarting. Messages: Failed to read from stream: Connection reset by peer
An error occured on the downstream connection..
Downstream connection closed.. shutdown upstream ns_port_server000 10:36:47 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 70. Restarting. Messages: Authenticating towards: {Sock 10.2.1.51:11210}
Authenticated towards: {Sock 10.2.1.51:11210}
Downstream connection closed.. shutdown upstream ns_port_server000 10:36:47 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 78. Restarting. Messages: Connecting to
Failed to connect to host: Failed to connect to [10.2.1.54:11210] ns_port_server000 10:36:47 - Thu Nov 18, 2010
Port server vbucketmigrator on node 'ns_1@10.2.1.51' exited with status 74. Restarting. Messages: Failed to read from stream: Connection reset by peer
An error occured on the downstream connection..
Downstream connection closed.. shutdown upstream ns_port_server000 10:36:47 - Thu Nov 18, 2010
Port server memcached on node 'ns_1@10.2.1.54' exited with status 136. Restarting. Messages: 35: FATAL: The engine does not support tap
35: FATAL: The engine does not support tap
35: FATAL: The engine does not support tap ns_port_server000 10:36:47 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.54' in 3270 seconds. ns_memcached001 10:22:41 - Thu Nov 18, 2010
Bucket "default" loaded on node 'ns_1@10.2.1.53' in 832 seconds. ns_memcached001 09:42:22 - Thu Nov 18, 2010
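To see which processes are crashing and how, the "Port server ... exited with status N" entries in the log above can be tallied per process, node, and status. The sketch below is illustrative only. The mapping of 70, 74, and 78 to `sysexits.h` names is standard, but whether vbucketmigrator actually follows that convention is an assumption, as is reading statuses above 128 as 128 plus a signal number. Repeat markers such as "(repeated 19 times)" are not counted.

```python
import re
from collections import Counter

# Matches the "Port server ... exited with status N." entries in the console log.
PORT_EXIT = re.compile(
    r"Port server (?P<proc>\w+) on node '(?P<node>[^']+)' "
    r"exited with status (?P<status>\d+)\."
)

# sysexits.h names for the codes seen in this report.
# ASSUMPTION: vbucketmigrator uses the sysexits.h convention.
SYSEXITS = {70: "EX_SOFTWARE", 74: "EX_IOERR", 78: "EX_CONFIG"}

# ASSUMPTION: a status above 128 means "terminated by signal (status - 128)".
SIGNALS = {8: "SIGFPE", 9: "SIGKILL"}


def describe_status(status):
    """Return a best-guess label for an exit status under the assumptions above."""
    if status in SYSEXITS:
        return SYSEXITS[status]
    if status > 128 and (status - 128) in SIGNALS:
        return "signal " + SIGNALS[status - 128]
    return "unknown"


def tally_exits(log_text):
    """Count exit entries per (process, node, status); ignores repeat markers."""
    counts = Counter()
    for m in PORT_EXIT.finditer(log_text):
        counts[(m.group("proc"), m.group("node"), int(m.group("status")))] += 1
    return counts


if __name__ == "__main__":
    # Two entries copied from the log in this report.
    sample = (
        "Port server memcached on node 'ns_1@10.2.1.54' exited with status 137. Restarting.\n"
        "Port server vbucketmigrator on node 'ns_1@10.2.1.53' exited with status 74. Restarting.\n"
    )
    for (proc, node, status), n in sorted(tally_exits(sample).items()):
        print(proc, node, status, describe_status(status), n)
```

Because the pattern is applied with `finditer` over the whole text rather than line by line, it also picks up entries that were fused onto the end of a preceding log line.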
Issue Links
- blocks
  - MB-2472 vbucketmigrator exit with status 70 and 78 (Closed)
- is duplicated by
  - MB-2472 vbucketmigrator exit with status 70 and 78 (Closed)
  - MB-2696 missing data on long-running rebalance test (Closed)
  - MB-2697 rebalance hangs on long-running rebalance test (Closed)
  - MB-2890 Data lost when upgrade and remove old node (Closed)
  - MB-2881 vbucketmigrator exited with status 70 and 3 after rebalance 2 nodes (Closed)