Details
- Bug
- Resolution: Duplicate
- Major
- 2.0.1
- Security Level: Public
- Windows 2008 R2 64-bit
Description
Environment:
- 9 servers running Windows 2008 R2 64-bit.
- Each server has 4 CPUs, 8 GB RAM, and an SSD.
- The cluster has 2 buckets, default and sasl, with consistent views enabled.
- Loaded 26 million items into the default bucket and 16 million items into the sasl bucket. Each key is 128 to 512 bytes in size.
- Each bucket has one design doc, and each design doc has 2 views.
- Rebalanced out 2 nodes, 10.3.121.173 and 10.3.121.243:
Starting rebalance, KeepNodes = ['ns_1@10.3.3.181','ns_1@10.3.121.47',
'ns_1@10.3.3.214','ns_1@10.3.3.182',
'ns_1@10.3.3.180','ns_1@10.3.121.171',
'ns_1@10.3.121.169'], EjectNodes = ['ns_1@10.3.121.173',
'ns_1@10.3.121.243'] ns_orchestrator004 ns_1@10.3.121.169 23:26:03 - Tue Jan 22, 2013
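For reference, a rebalance that ejects these two nodes can be expressed as a form-encoded body for the Couchbase REST endpoint `POST /controller/rebalance`, which takes `knownNodes` and `ejectedNodes` parameters. The sketch below only builds that body from the node names in the log line above. The helper name `build_rebalance_body` is hypothetical, and whether this test triggered the rebalance through the REST API directly is an assumption.

```python
from urllib.parse import urlencode

# Node names copied from the "Starting rebalance" log line above.
KEEP_NODES = [
    "ns_1@10.3.3.181", "ns_1@10.3.121.47", "ns_1@10.3.3.214",
    "ns_1@10.3.3.182", "ns_1@10.3.3.180", "ns_1@10.3.121.171",
    "ns_1@10.3.121.169",
]
EJECT_NODES = ["ns_1@10.3.121.173", "ns_1@10.3.121.243"]


def build_rebalance_body(keep_nodes, eject_nodes):
    """Return a form-encoded body for POST /controller/rebalance.

    Hypothetical helper for illustration; it does not send anything.
    """
    # knownNodes lists every node currently in the cluster,
    # including the ones that are about to be ejected.
    known = list(keep_nodes) + list(eject_nodes)
    return urlencode({
        "knownNodes": ",".join(known),
        "ejectedNodes": ",".join(eject_nodes),
    })


body = build_rebalance_body(KEEP_NODES, EJECT_NODES)
```

The resulting string would be sent with administrator credentials to the cluster's REST port; the transport is left out so the sketch stays runnable without a cluster.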
- Rebalance failed because the buckets were shutting down on the orchestrator node.
[ns_server:debug,2013-01-23T8:29:27.672,ns_1@10.3.121.169:ns_config_log<0.803.0>:ns_config_log:log_common:111]config change:
rebalance_status ->
[user:info,2013-01-23T8:29:26.219,ns_1@10.3.121.169:ns_memcached-default<0.968.1>:ns_memcached:terminate:661]Shutting down bucket "default" on 'ns_1@10.3.121.169' for server shutdown
[ns_server:error,2013-01-23T8:29:26.219,ns_1@10.3.121.169:timeout_diag_logger<0.699.0>:timeout_diag_logger:handle_call:104]
{<0.12009.70>,
[
,
,
{initial_call,{proc_lib,init_p,5}},
,
,
{garbage_collection,[
,
,
,
]},
,
,
,
,
,
,
]}
[ns_server:debug,2013-01-23T8:29:27.313,ns_1@10.3.121.169:<0.835.0>:ns_pubsub:do_subscribe_link:132]Parent process of subscription
{buckets_events,<0.833.0>} exited with reason {shutdown,
{gen_server,
call,
['ns_vbm_new_sup-sasl',
which_children,
infinity]}}
[ns_server:debug,2013-01-23T8:29:27.313,ns_1@10.3.121.169:ns_config_log<0.803.0>:ns_config_log:log_common:111]config change:
rebalancer_pid ->
undefined
[ns_server:debug,2013-01-23T8:29:27.329,ns_1@10.3.121.169:capi_set_view_manager-sasl<0.8923.0>:capi_set_view_manager:handle_info:349]doing replicate_newnodes_docs
[user:info,2013-01-23T8:29:27.329,ns_1@10.3.121.169:ns_memcached-sasl<0.8955.0>:ns_memcached:terminate:661]Shutting down bucket "sasl" on 'ns_1@10.3.121.169' for server shutdown
[ns_server:debug,2013-01-23T8:29:27.344,ns_1@10.3.121.169:ns_config_log<0.803.0>:ns_config_log:log_common:111]config change:
auto_failover_cfg ->
[
,
{timeout,30},
{max_nodes,1},
{count,0}]
[ns_server:debug,2013-01-23T8:29:27.360,ns_1@10.3.121.169:ns_config_rep<0.31635.76>:ns_config_rep:do_push_keys:317]Replicating some config keys ([auto_failover_cfg,autocompaction,buckets,
cluster_compat_version,counters,
dynamic_config_version]..)
[ns_server:debug,2013-01-23T8:29:27.360,ns_1@10.3.121.169:capi_set_view_manager-sasl<0.8923.0>:capi_set_view_manager:handle_info:349]doing replicate_newnodes_docs
[ns_server:error,2013-01-23T8:29:27.360,ns_1@10.3.121.169:timeout_diag_logger<0.699.0>:timeout_diag_logger:handle_call:104]
{<0.10831.67>,
- Memcached logs from around the time the rebalance failed:
Wed Jan 23 08:29:27.208484 Pacific Standard Time 3: TAP (Consumer) eq_tapq:anon_18 - disconnected
Wed Jan 23 08:29:27.286609 Pacific Standard Time 3: TAP (Consumer) eq_tapq:anon_20 - disconnected
Wed Jan 23 08:29:28.145984 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_17"
Wed Jan 23 08:29:28.161609 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_18"
Wed Jan 23 08:29:28.161609 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_19"
Wed Jan 23 08:29:28.161609 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_20"
Wed Jan 23 08:29:28.177234 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_21"
Wed Jan 23 08:29:28.177234 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_22"
Wed Jan 23 08:29:28.192859 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_23"
Wed Jan 23 08:29:28.208484 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:anon_24"
Wed Jan 23 08:29:29.005359 Pacific Standard Time 3: Shutting down tap connections!
Wed Jan 23 08:29:29.005359 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.121.171"
Wed Jan 23 08:29:29.083484 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.182"
Wed Jan 23 08:29:29.083484 Pacific Standard Time 3: Failed to notify thread: Unknown error
Wed Jan 23 08:29:29.114734 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.121.47"
Wed Jan 23 08:29:29.114734 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.121.171 - Clear the tap queues by force
Wed Jan 23 08:29:29.114734 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.214"
Wed Jan 23 08:29:29.114734 Pacific Standard Time 3: Failed to notify thread: Unknown error
Wed Jan 23 08:29:29.114734 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.180"
Wed Jan 23 08:29:29.114734 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.182 - Clear the tap queues by force
Wed Jan 23 08:29:29.114734 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.181"
Wed Jan 23 08:29:29.130359 Pacific Standard Time 3: Failed to notify thread: Unknown error
Wed Jan 23 08:29:29.130359 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.121.47 - Clear the tap queues by force
Wed Jan 23 08:29:29.130359 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.214 - Clear the tap queues by force
Wed Jan 23 08:29:29.130359 Pacific Standard Time 3: Failed to notify thread: Unknown error
Wed Jan 23 08:29:29.130359 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.180 - Clear the tap queues by force
Wed Jan 23 08:29:29.130359 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.181 - Clear the tap queues by force
Wed Jan 23 08:29:42.130359 Pacific Standard Time 3: Had to wait 12 s for shutdown
Wed Jan 23 08:30:01.442859 Pacific Standard Time 3: Shutting down tap connections!
Wed Jan 23 08:30:01.442859 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.121.47"
Wed Jan 23 08:30:01.505359 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.121.171"
Wed Jan 23 08:30:01.505359 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.121.47 - Clear the tap queues by force
Wed Jan 23 08:30:01.505359 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.181"
Wed Jan 23 08:30:01.505359 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.121.171 - Clear the tap queues by force
Wed Jan 23 08:30:01.520984 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.214"
Wed Jan 23 08:30:01.520984 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.181 - Clear the tap queues by force
Wed Jan 23 08:30:01.520984 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.180"
Wed Jan 23 08:30:01.536609 Pacific Standard Time 3: Schedule cleanup of "eq_tapq:replication_ns_1@10.3.3.182"
Wed Jan 23 08:30:01.520984 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.214 - Clear the tap queues by force
Wed Jan 23 08:30:01.536609 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.180 - Clear the tap queues by force
Wed Jan 23 08:30:01.536609 Pacific Standard Time 3: TAP (Producer) eq_tapq:replication_ns_1@10.3.3.182 - Clear the tap queues by force
Wed Jan 23 08:30:16.536609 Pacific Standard Time 3: Had to wait 15 s for shutdown
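To pull the key events out of a memcached log like the one above, a small parser can extract the shutdown waits and count the "Failed to notify thread" errors. This is a minimal sketch that assumes every line follows the layout seen in this excerpt (weekday, month, day, time with microseconds, time zone name, a numeric field, then the message). The names `parse_line` and `summarize` are hypothetical, the year is not present in the log and is supplied by the caller, and the meaning of the numeric field (3 here) is not stated in the report, so it is ignored.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt above:
#   "Wed Jan 23 08:29:42.130359 Pacific Standard Time 3: <message>"
LINE_RE = re.compile(
    r"^\w{3} (\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d{6}) (.+?) (\d+): (.*)$"
)
WAIT_RE = re.compile(r"Had to wait (\d+) s for shutdown")


def parse_line(line, year=2013):
    """Return (timestamp, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp, _tz_name, _numeric_field, message = m.groups()
    # The log omits the year, so the caller supplies it.
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
    return when, message


def summarize(lines):
    """Collect shutdown waits and count thread-notification failures."""
    waits = []
    notify_failures = 0
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, message = parsed
        m = WAIT_RE.search(message)
        if m:
            waits.append((when, int(m.group(1))))
        elif message.startswith("Failed to notify thread"):
            notify_failures += 1
    return waits, notify_failures


SAMPLE = [
    "Wed Jan 23 08:29:29.083484 Pacific Standard Time 3: Failed to notify thread: Unknown error",
    "Wed Jan 23 08:29:42.130359 Pacific Standard Time 3: Had to wait 12 s for shutdown",
    "Wed Jan 23 08:30:16.536609 Pacific Standard Time 3: Had to wait 15 s for shutdown",
]
waits, notify_failures = summarize(SAMPLE)
```

Running `summarize` over the full memcached log instead of `SAMPLE` gives one entry per bucket shutdown, which makes it easier to line the waits up against the ns_server timestamps.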
Link to the manifest file of this build: http://builds.hq.northscale.net/latestbuilds/couchbase-server-enterprise_x86_64_2.0.1-140-rel.setup.exe.manifest.xml
Issue Links
- depends on MB-7658 [Windows] severe timeouts in windows cluster with 2 buckets and 1 ddoc per bucket causes rebalance and other failures (Closed)