Details
- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Critical
- Labels: None
- Affects Version/s: 7.1.0
- Build: 7.1.0-1220
- Triage: Triaged
- Story Points: 1
- Is this a Regression?: Unknown
Description
Steps:
- Create a 3-node cluster.
- Create 1 bucket with 9 scopes plus the default scope and 10 collections per scope, for 100 collections in total (see the SDK sketch after this list).
- Create 1,000,000 items sequentially per collection.
- Update 1,000,000 random keys per collection to create 50 percent fragmentation.
- Create 1,000,000 more random keys per collection.
- Update 1,000,000 random keys per collection to create 50 percent fragmentation.
- Rebalance-in with document loading in parallel.
- Rebalance-out with document loading in parallel.
- Rebalance-in/out with document loading in parallel.
- Swap rebalance with document loading in parallel.
- Failover a node and rebalance that node out with loading in parallel.
- Failover a node, set it for full recovery, and rebalance the cluster -> the rebalance fails with buckets_shutdown_wait_failed (a REST sketch of this sequence follows the log excerpts below).
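The collection layout and loading pattern from the steps above can be approximated with the Couchbase Python SDK. This is a minimal sketch only, assuming SDK 3.x, placeholder credentials, and hypothetical scope/collection names; the test itself drives this through testrunner, not through this code.

import random

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster, ClusterOptions
from couchbase.management.collections import CollectionSpec

cluster = Cluster("couchbase://172.23.110.64",
                  ClusterOptions(PasswordAuthenticator("Administrator", "password")))
bucket = cluster.bucket("GleamBookUsers0")
mgr = bucket.collections()

# 9 named scopes plus _default, 10 collections per scope = 100 collections.
for s in range(9):
    scope = "VolumeScope%d" % s          # hypothetical scope name
    mgr.create_scope(scope)
    for c in range(10):
        mgr.create_collection(CollectionSpec("VolumeCollection%d" % c,
                                             scope_name=scope))

# Per collection: a sequential create pass, then a random-key update pass
# that rewrites existing documents so roughly half of the on-disk data
# becomes stale (~50 percent fragmentation). Shown for one collection here.
coll = bucket.scope("VolumeScope0").collection("VolumeCollection0")
for i in range(1000000):
    coll.upsert("key-%016d" % i, {"body": "x" * 1024})
for _ in range(1000000):
    coll.upsert("key-%016d" % random.randrange(1000000), {"body": "y" * 1024})

The final failover/full-recovery step then fails; the UI log entries collected by the test follow: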
{u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303517487L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:05:17.487Z', u'text': u'Rebalance exited with reason {buckets_shutdown_wait_failed,\n [{\'ns_1@172.23.110.66\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["GleamBookUsers0"]}}}]}.\nRebalance Operation Id = 216f6ab628e7f979c7fe2b4854e2ede0'}
2021-08-29 23:05:17,971 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'ns_rebalancer', u'type': u'critical', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303517485L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:05:17.485Z', u'text': u'Failed to wait deletion of some buckets on some nodes: [{\'ns_1@172.23.110.66\',\n {\'EXIT\',\n {old_buckets_shutdown_wait_failed,\n ["GleamBookUsers0"]}}}]\n'}
2021-08-29 23:05:17,971 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'ns_orchestrator', u'type': u'info', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303457482L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:04:17.482Z', u'text': u"Starting rebalance, KeepNodes = ['ns_1@172.23.110.64','ns_1@172.23.110.66',\n 'ns_1@172.23.110.65'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = 216f6ab628e7f979c7fe2b4854e2ede0"}
2021-08-29 23:05:17,973 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'ns_orchestrator', u'type': u'info', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303421004L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:03:41.004Z', u'text': u'Graceful failover completed successfully.\nRebalance Operation Id = deb95d22d6e9250f9507f271830cdf16'}
2021-08-29 23:05:17,973 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'ns_memcached', u'type': u'info', u'node': u'ns_1@172.23.110.66', u'tstamp': 1630303420920L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:03:40.920Z', u'text': u'Shutting down bucket "GleamBookUsers0" on \'ns_1@172.23.110.66\' for deletion'}
2021-08-29 23:05:17,973 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'failover', u'type': u'info', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303420886L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:03:40.886Z', u'text': u"Deactivating failed over nodes ['ns_1@172.23.110.66']"}
2021-08-29 23:05:17,973 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'failover', u'type': u'info', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303420879L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:03:40.879Z', u'text': u"Failed over ['ns_1@172.23.110.66']: ok"}
2021-08-29 23:05:17,973 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'failover', u'type': u'info', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303420692L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:03:40.692Z', u'text': u"Starting failing over ['ns_1@172.23.110.66']"}
2021-08-29 23:05:17,973 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'ns_vbucket_mover', u'type': u'info', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303405222L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:03:25.222Z', u'text': u'Bucket "GleamBookUsers0" rebalance does not seem to be swap rebalance'}
2021-08-29 23:05:17,973 | test | ERROR | pool-3-thread-30 | [rest_client:print_UI_logs:2786] {u'code': 0, u'module': u'ns_rebalancer', u'type': u'info', u'node': u'ns_1@172.23.110.64', u'tstamp': 1630303405155L, u'shortText': u'message', u'serverTime': u'2021-08-29T23:03:25.155Z', u'text': u"Starting vbucket moves for graceful failover of ['ns_1@172.23.110.66']"}
2021-08-29 23:05:17,974 | test | ERROR | pool-3-thread-30 | [task:call:269] Rebalance Failed: {u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try again.', u'type': u'rebalance', u'masterRequestTimedOut': False, u'statusId': u'6a263d67b212bbe7d12d3f75cd442b39', u'statusIsStale': False, u'lastReportURI': u'/logs/rebalanceReport?reportID=287965ee676b56b6c2af48f665b8f107', u'status': u'notRunning'}
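The sequence that triggers the error maps onto three cluster-manager REST calls. A rough sketch of reproducing it outside testrunner, assuming plain requests, placeholder admin credentials, and the node names from the logs above:

import time
import requests

BASE = "http://172.23.110.64:8091"
AUTH = ("Administrator", "password")         # placeholder credentials
VICTIM = "ns_1@172.23.110.66"
KNOWN = "ns_1@172.23.110.64,ns_1@172.23.110.65,ns_1@172.23.110.66"

def wait_for_rebalance():
    # Graceful failover and rebalance both surface as a "rebalance" task.
    while True:
        tasks = requests.get(BASE + "/pools/default/tasks", auth=AUTH).json()
        reb = next(t for t in tasks if t["type"] == "rebalance")
        if reb["status"] != "running":
            return reb
        time.sleep(5)

# 1. Graceful failover of the victim node.
requests.post(BASE + "/controller/startGracefulFailover",
              auth=AUTH, data={"otpNode": VICTIM})
time.sleep(5)  # give the task a moment to appear before polling
wait_for_rebalance()

# 2. Mark the failed-over node for full recovery.
requests.post(BASE + "/controller/setRecoveryType",
              auth=AUTH, data={"otpNode": VICTIM, "recoveryType": "full"})

# 3. Rebalance the node back in; this is the step that exited with
#    buckets_shutdown_wait_failed in the logs above.
requests.post(BASE + "/controller/rebalance",
              auth=AUTH, data={"knownNodes": KNOWN, "ejectedNodes": ""})
print(wait_for_rebalance())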
QE Test:
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job3.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t volumetests.Magma.volume.ClusterOpsVolume,nodes_init=3,replicas=1,num_failed_nodes=1,new_replica=1,graceful=True,skip_cleanup=True,num_items=1000000,num_buckets=1,bucket_names=GleamBook,doc_size=1024,bucket_type=membase,eviction_policy=fullEviction,compression_mode=off,iterations=2,sdk_timeout=60,log_level=info,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=20,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,collection_prefix=Volume,num_scopes=9,num_collections=5,pc=25,ops_rate=200000,key_type=RandomKey,ramQuota=102400,mutation_perc=20 -m rest'
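The print_UI_logs output in the excerpt is simply the cluster's UI event log, which the cluster manager serves over REST. A minimal sketch of pulling the critical events, again with placeholder credentials:

import requests

resp = requests.get("http://172.23.110.64:8091/logs",
                    auth=("Administrator", "password"))
# /logs returns {"list": [...]} with the same fields seen in the excerpts
# above (module, type, node, serverTime, text, ...).
for event in resp.json()["list"]:
    if event["type"] == "critical":
        print(event["serverTime"], event["module"], event["text"])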
Issue Links
- duplicates:
  MB-49215 [ARM][System Test] rebalance failure observed in component level test - buckets_shutdown_wait_failed (Closed)
- is duplicated by:
  MB-50139 [10TB, Unbounded, KV]: buckets_shutdown_wait_failed on the node failed over gracefully and added back with full recovery followed by rebalanced. (Closed)