Details
-
Bug
-
Resolution: User Error
-
Major
-
5.5.4
-
None
-
5.5.4-4338 -> 5.5.4-4340
-
Untriaged
-
Centos 64-bit
-
-
No
Description
Script to Repro
./testrunner -i /tmp/win10-bucket-ops.ini -p upgrade_version=5.5.4-4340 -t newupgradetests.MultiNodesUpgradeTests.online_upgrade_swap_rebalance_with_high_doc_ops,initial_version=5.5.4-4338,items=1000000,nodes_init=3,run_with_views=False,flusher_batch_split_trigger=3
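To make the test parameters in the command above easier to read, the value passed to `-t` can be split into the test path and its key=value settings. This is only an illustrative sketch: the helper name `parse_test_spec` is hypothetical and this is not how testrunner itself parses the argument.

```python
from typing import Dict, Tuple


def parse_test_spec(spec: str) -> Tuple[str, Dict[str, str]]:
    """Split a '-t' value into the test path and its key=value parameters.

    Hypothetical helper for illustration; not part of testrunner.
    """
    test_path, *raw_params = spec.split(",")
    params = {}
    for item in raw_params:
        # Everything before the first '=' is the key, the rest is the value.
        key, _, value = item.partition("=")
        params[key] = value
    return test_path, params


# The -t value from the repro command in this ticket.
spec = (
    "newupgradetests.MultiNodesUpgradeTests."
    "online_upgrade_swap_rebalance_with_high_doc_ops,"
    "initial_version=5.5.4-4338,items=1000000,nodes_init=3,"
    "run_with_views=False,flusher_batch_split_trigger=3"
)
test_path, params = parse_test_spec(spec)
for key, value in params.items():
    print(f"{key} = {value}")
```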
Steps to Repro
1) Create a 1-node cluster (5.5.4-4338) and a bucket named default.
2) Run the following command, then restart memcached:
curl -i -u Administrator:password --data 'ns_bucket:update_bucket_props("default", [{extra_config_string, "flusher_batch_split_trigger=3"}]).' http://host:8091/diag/eval
3) Start a data load and leave it in progress.
4) Rebalance in one node (5.5.4-4338).
5) Rebalance in another node (5.5.4-4338).
6) Check the following condition on the stats. It is expected not to hold because of MB-34173:
last_persisted_snap_start <= last_persisted_seqno <= last_persisted_snap_end
7) Start the data load again.
8) Swap rebalance one 5.5.4-4338 node with a 5.5.4-4340 node.
9) Repeat steps 7) and 8) until the last 5.5.4-4338 node has been swap rebalanced out.
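The condition checked in step 6 can be sketched as a small predicate over one vBucket's stats. The function name and the sample values below are hypothetical; how the stats are collected is not specified in this ticket, so the sketch simply assumes they arrive as a mapping from stat name to value.

```python
def snapshot_invariant_holds(stats):
    """Return True if snap_start <= persisted seqno <= snap_end.

    `stats` is assumed to be a mapping from stat name to a numeric value
    (or numeric string) for a single vBucket.
    """
    snap_start = int(stats["last_persisted_snap_start"])
    seqno = int(stats["last_persisted_seqno"])
    snap_end = int(stats["last_persisted_snap_end"])
    return snap_start <= seqno <= snap_end


# Hypothetical values, for illustration only.
consistent = {
    "last_persisted_snap_start": "100",
    "last_persisted_seqno": "150",
    "last_persisted_snap_end": "200",
}
inconsistent = {
    "last_persisted_snap_start": "100",
    "last_persisted_seqno": "250",
    "last_persisted_snap_end": "200",
}
print(snapshot_invariant_holds(consistent))
print(snapshot_invariant_holds(inconsistent))
```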
However, the first two swap rebalances took around 9 minutes and 18 minutes respectively, while the final swap rebalance took an unusually long time (close to 2.5 hours).
In the final swap rebalance, the node being rebalanced in is 172.23.121.10 (5.5.4-4340) and the node being rebalanced out is 172.23.120.201 (5.5.4-4338).
I see the following entries in the logs (on 172.23.121.10):
[rebalance:debug,2019-05-23T07:30:34.618-07:00,ns_1@172.23.121.10:<0.16083.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail while waiting for sequence number 20547 to persist for vBucket:997. Will retry.
[rebalance:debug,2019-05-23T07:30:35.064-07:00,ns_1@172.23.121.10:<0.16092.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail while waiting for sequence number 20528 to persist for vBucket:671. Will retry.
[rebalance:debug,2019-05-23T07:30:35.070-07:00,ns_1@172.23.121.10:<0.16098.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail while waiting for sequence number 20636 to persist for vBucket:996. Will retry.
[rebalance:debug,2019-05-23T07:30:35.368-07:00,ns_1@172.23.121.10:<0.16141.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail while waiting for sequence number 20760 to persist for vBucket:995. Will retry.
[rebalance:debug,2019-05-23T07:30:35.380-07:00,ns_1@172.23.121.10:<0.16147.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail while waiting for sequence number 20887 to persist for vBucket:994. Will retry.
[rebalance:debug,2019-05-23T07:30:35.544-07:00,ns_1@172.23.121.10:<0.16153.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail while waiting for sequence number 20655 to persist for vBucket:670. Will retry.
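To see which vBuckets are retrying most often, log lines in the format shown above can be tallied per vBucket. This is a minimal sketch: the regular expression is derived only from the entries quoted in this ticket, and the helper name `count_etmpfail_retries` is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the log entries quoted in this ticket.
ETMPFAIL_RE = re.compile(
    r"Got etmpfail while waiting for sequence number (\d+) "
    r"to persist for vBucket:(\d+)"
)


def count_etmpfail_retries(lines):
    """Count etmpfail retry messages per vBucket id."""
    retries = Counter()
    for line in lines:
        match = ETMPFAIL_RE.search(line)
        if match:
            retries[int(match.group(2))] += 1
    return retries


# Two of the entries quoted above, used as sample input.
sample = [
    "[rebalance:debug,2019-05-23T07:30:34.618-07:00,ns_1@172.23.121.10:"
    "<0.16083.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail "
    "while waiting for sequence number 20547 to persist for vBucket:997. "
    "Will retry.",
    "[rebalance:debug,2019-05-23T07:30:35.064-07:00,ns_1@172.23.121.10:"
    "<0.16092.1>:janitor_agent:do_wait_seqno_persisted:982]Got etmpfail "
    "while waiting for sequence number 20528 to persist for vBucket:671. "
    "Will retry.",
]
retries = count_etmpfail_retries(sample)
for vbucket, count in retries.most_common():
    print(f"vBucket {vbucket}: {count} retry message(s)")
```

Run against the full debug log from the attached collection, a count that keeps growing for the same vBuckets would suggest persistence on those vBuckets is not keeping up.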
cbcollect_info logs are attached.