Details
Bug
Resolution: Fixed
Major
Cheshire-Cat
7.0.0-2147
Untriaged
Centos 64-bit
1
No
Description
Script to repro:
./testrunner -i /tmp/durability_volume.ini -t volumetests.Collections.volume.test_volume_taf,nodes_init=4,replicas=1,num_failed_nodes=1,new_replica=1,graceful=True,bucket_spec=multi_bucket.buckets_for_volume_tests_with_ttl,iterations=1,doc_and_collection_ttl=True,data_load_spec=volume_test_load_with_doc_ttl,sdk_client_pool=True,quota_percent=100,rerun=False,skip_collections_cleanup=True,skip_cleanup=True
Steps to Repro
1) Create a 4-node cluster
2020-05-22 21:47:10,243 | test | INFO | pool-2-thread-7 | [table_view:display:72] Rebalance Overview
------------------------------------------
Nodes          | Services | Status       |
------------------------------------------
172.23.121.81  | kv       | Cluster node |
172.23.121.83  | None     | <--- IN ---  |
172.23.121.85  | None     | <--- IN ---  |
172.23.121.105 | None     | <--- IN ---  |
------------------------------------------
2) Create scope + collections + data load
2020-05-22 22:00:03,246 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
------------------------------------------------------------------------------------------+
Bucket  | Type      | Replicas | TTL | Items    | RAM Quota   | RAM Used    | Disk Used   |
------------------------------------------------------------------------------------------+
bucket1 | membase   | 3        | 0   | 20000    | 419430400   | 148296240   | 113683122   |
bucket2 | ephemeral | 3        | 0   | 30000    | 419430400   | 119000368   | 136         |
default | membase   | 3        | 350 | 20041547 | 71303168000 | 17780179472 | 13332354608 |
------------------------------------------------------------------------------------------+
3) Rebalance in while loading docs
2020-05-22 22:00:06,082 | test | INFO | MainThread | [Collections:test_volume_taf:122] Step 5: Rebalance in with Loading of docs
2020-05-22 22:00:09,119 | test | INFO | pool-2-thread-30 | [table_view:display:72] Rebalance Overview
------------------------------------------
Nodes          | Services | Status       |
------------------------------------------
172.23.121.81  | kv       | Cluster node |
172.23.121.83  | kv       | Cluster node |
172.23.121.105 | kv       | Cluster node |
172.23.121.85  | kv       | Cluster node |
172.23.121.138 | None     | <--- IN ---  |
------------------------------------------
4) Rebalance out while loading docs
2020-05-22 22:30:09,706 | test | INFO | MainThread | [Collections:test_volume_taf:134] Step 6: Rebalance Out with Loading of docs
2020-05-22 22:30:09,796 | test | INFO | pool-2-thread-10 | [table_view:display:72] Rebalance Overview
------------------------------------------
Nodes          | Services | Status       |
------------------------------------------
172.23.121.81  | kv       | Cluster node |
172.23.121.83  | [u'kv']  | --- OUT ---> |
172.23.121.105 | kv       | Cluster node |
172.23.121.85  | kv       | Cluster node |
172.23.121.138 | kv       | Cluster node |
------------------------------------------
5) Rebalance in/out while loading docs
2020-05-22 23:03:08,331 | test | INFO | MainThread | [Collections:test_volume_taf:146] Step 7: Rebalance In_Out with Loading of docs
2020-05-22 23:03:13,733 | test | INFO | pool-2-thread-13 | [table_view:display:72] Rebalance Overview
------------------------------------------
Nodes          | Services | Status       |
------------------------------------------
172.23.121.81  | kv       | Cluster node |
172.23.121.105 | kv       | Cluster node |
172.23.121.85  | kv       | Cluster node |
172.23.121.138 | [u'kv']  | --- OUT ---> |
172.23.121.83  | None     | <--- IN ---  |
172.23.121.114 | None     | <--- IN ---  |
------------------------------------------
The rebalance in/out fails with the following error:
2020-05-22 23:05:41,430 | test | ERROR | pool-2-thread-13 | [rest_client:print_UI_logs:2537] {u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', u'node': u'ns_1@172.23.121.81', u'tstamp': 1590213937888L, u'shortText': u'message', u'serverTime': u'2020-05-22T23:05:37.888Z', u'text': u'Rebalance exited with reason {mover_crashed,\n {unexpected_exit,\n {\'EXIT\',<0.12092.35>,\n {{bulk_set_vbucket_state_failed,\n [{\'ns_1@172.23.121.138\',\n {\'EXIT\',\n {{{{{{{badmatch,{error,einval}},\n [{dcp_proxy,handle_packet,2,\n [{file,"src/dcp_proxy.erl"},\n {line,189}]},\n {dcp_proxy,process_data_loop,3,\n [{file,"src/dcp_proxy.erl"},\n {line,368}]},\n {dcp_proxy,handle_info,2,\n [{file,"src/dcp_proxy.erl"},\n {line,103}]},\n {gen_server,try_dispatch,4,\n [{file,"gen_server.erl"},\n {line,616}]},\n {gen_server,handle_msg,6,\n [{file,"gen_server.erl"},\n {line,686}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},\n {line,247}]}]},\n {gen_server,call,\n [<25249.12306.1>,get_partitions,\n infinity]}},\n {gen_server,call,\n [<25249.12305.1>,get_partitions,\n infinity]}},\n {gen_server,call,\n [\'dcp_replication_manager-default\',\n {manage_replicators,\n [\'ns_1@172.23.121.105\',\n \'ns_1@172.23.121.81\',\n \'ns_1@172.23.121.85\']},\n infinity]}},\n {gen_server,call,\n [\'replication_manager-default\',\n {change_vbucket_replication,665,\n undefined},\n infinity]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\n \'ns_1@172.23.121.138\'},\n {if_rebalance,<0.1134.34>,\n {update_vbucket_state,995,replica,\n undefined,undefined}},\n infinity]}}}}]},\n [{janitor_agent,bulk_set_vbucket_state,4,\n [{file,"src/janitor_agent.erl"},\n {line,403}]},\n {ns_single_vbucket_mover,\n \'-cleanup_old_streams/4-fun-1-\',4,\n [{file,"src/ns_single_vbucket_mover.erl"},\n {line,353}]},\n {proc_lib,init_p,3,\n [{file,"proc_lib.erl"},{line,232}]}]}}}}.\nRebalance Operation Id = 3593e7b121cb66b0003e4e31bc1f612c'}
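The error above is logged as a Python 2 dict repr, so the Erlang crash term in its 'text' field appears on one line with escaped "\n" sequences. A minimal sketch for making it readable follows, assuming Python 3 and that the dict starts at the first "{" on the line. The helper name extract_ui_log_text is hypothetical, and the sample line is abbreviated (the trailing "..." marks the cut), not the full log entry.

```python
import ast
import re


def extract_ui_log_text(line):
    """Return the 'text' field of a print_UI_logs entry with real newlines.

    Hypothetical helper: assumes the dict repr starts at the first '{'.
    """
    payload = line[line.index("{"):]
    # Python 2 long literals (e.g. 1590213937888L) are invalid in Python 3.
    # Caveat: this regex would also rewrite a digit+L sequence inside a string.
    payload = re.sub(r"(\d+)L\b", r"\1", payload)
    entry = ast.literal_eval(payload)
    return entry["text"]


# Abbreviated sample; raw strings keep the backslash-n pairs as logged.
sample = (
    r"2020-05-22 23:05:41,430 | test | ERROR | pool-2-thread-13 | "
    r"[rest_client:print_UI_logs:2537] {u'code': 0, "
    r"u'module': u'ns_orchestrator', u'tstamp': 1590213937888L, "
    r"u'text': u'Rebalance exited with reason {mover_crashed,\n"
    r" {unexpected_exit,\n ...'}"
)

print(extract_ui_log_text(sample))
```

Running it on the full log line should print the crash term one frame per line, which makes the dcp_proxy:handle_packet badmatch on {error,einval} easier to spot.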
cbcollect_info attached. This is the first time we are running this test.
Detailed steps are available at https://hub.internal.couchbase.com/confluence/pages/viewpage.action?pageId=50135893