Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.6.0
-
7.6.0-2032
-
Triaged
-
Linux x86_64
-
0
-
Yes
Description
Steps To Recreate:
- Create a 5 node cluster
- Create two magma buckets(default and bucket2) and a couchstore bucket(bucket 1 )
- All three buckets are created with replicas=2 and ttl=1100
- Load 34985500 docs to default bucket and 50k docs in remaining two buckets(bucket1 & bucket2)
- Trigger swap rebalance(Nodes coming in :172.23.120.92, 172.23.106.217, Nodes going out: 172.23.120.144, 172.23.122.135)
- Start new doc ops while swap rebalance is going on
- Rebalance fails and Observed below memcached logs
Below logs are from node 172.23.122.135
Memcached Logs:
2024-01-19T23:44:10.438599-08:00 ERROR 5295: Exception occurred during packet execution. Closing connection [ {"ip":"172.23.120.148","port":45232} - {"ip":"172.23.122.135","port":11206} (System, @ns_server) ]: ThrowExceptionUnderflowPolicy current:442266 arg:447717. Cookies: [{"aiostat":"success","ewouldblock":false,"packet":{"bodylen":4,"cas":0,"datatype":"raw","extlen":4,"extras":{"buffer_bytes":447717},"keylen":0,"magic":"ClientRequest","opaque":2396,"opcode":"DCP_BUFFER_ACKNOWLEDGEMENT","vbucket":0},"refcount":1,"started":"81708854755859 (311 us ago)","throttled":false}] Exception thrown from: ["#0 /opt/couchbase/bin/memcached() [0x400000+0x1443b9]","#1 /opt/couchbase/bin/memcached() [0x400000+0x4b5ee7]","#2 /opt/couchbase/bin/memcached() [0x400000+0x4b684d]","#3 /opt/couchbase/bin/memcached() [0x400000+0x4b6a35]","#4 /opt/couchbase/bin/memcached() [0x400000+0x4b6d25]","#5 /opt/couchbase/bin/memcached() [0x400000+0x343584]","#6 /opt/couchbase/bin/memcached() [0x400000+0x1d333d]","#7 /opt/couchbase/bin/memcached() [0x400000+0x2a6e30]","#8 /opt/couchbase/bin/memcached() [0x400000+0x22243b]","#9 /opt/couchbase/bin/memcached() [0x400000+0x208b98]","#10 /opt/couchbase/bin/memcached() [0x400000+0x20d9f5]","#11 /opt/couchbase/bin/memcached() [0x400000+0x217f27]","#12 /opt/couchbase/bin/../lib/libevent_core-2.1.so.7() [0x7fd853975000+0xf84e]","#13 /opt/couchbase/bin/../lib/libevent_core-2.1.so.7() [0x7fd853975000+0x18b99]","#14 /opt/couchbase/bin/../lib/libevent_core-2.1.so.7(event_base_loop+0x357) [0x7fd853975000+0x19287]","#15 /opt/couchbase/bin/memcached() [0x400000+0x981a96]","#16 /opt/couchbase/bin/memcached() [0x400000+0x981f9e]","#17 /opt/couchbase/bin/memcached() [0x400000+0x984468]","#18 /opt/couchbase/bin/memcached() [0x400000+0x231a29]","#19 /opt/couchbase/bin/memcached() [0x400000+0x8d156c]","#20 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fd852bb0000+0xe4aa3]","#21 /lib/x86_64-linux-gnu/libpthread.so.0() [0x7fd85330d000+0x7fa3]","#22 /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fd852846000+0xf8eff]"]
|
2024-01-19T23:47:01.510775-08:00 ERROR 1518: Exception occurred during packet execution. Closing connection [ {"ip":"172.23.106.217","port":39886} - {"ip":"172.23.122.135","port":11206} (System, @ns_server) ]: ThrowExceptionUnderflowPolicy current:2237077 arg:2241383. Cookies: [{"aiostat":"success","ewouldblock":false,"packet":{"bodylen":4,"cas":0,"datatype":"raw","extlen":4,"extras":{"buffer_bytes":2241383},"keylen":0,"magic":"ClientRequest","opaque":99,"opcode":"DCP_BUFFER_ACKNOWLEDGEMENT","vbucket":0},"refcount":1,"started":"81879927140999 (115 us ago)","throttled":false}] Exception thrown from: ["#0 /opt/couchbase/bin/memcached() [0x400000+0x1443b9]","#1 /opt/couchbase/bin/memcached() [0x400000+0x4b5ee7]","#2 /opt/couchbase/bin/memcached() [0x400000+0x4b684d]","#3 /opt/couchbase/bin/memcached() [0x400000+0x4b6945]","#4 /opt/couchbase/bin/memcached() [0x400000+0x4b6d25]","#5 /opt/couchbase/bin/memcached() [0x400000+0x343584]","#6 /opt/couchbase/bin/memcached() [0x400000+0x1d333d]","#7 /opt/couchbase/bin/memcached() [0x400000+0x2a6e30]","#8 /opt/couchbase/bin/memcached() [0x400000+0x22243b]","#9 /opt/couchbase/bin/memcached() [0x400000+0x208b98]","#10 /opt/couchbase/bin/memcached() [0x400000+0x20d9f5]","#11 /opt/couchbase/bin/memcached() [0x400000+0x217f27]","#12 /opt/couchbase/bin/../lib/libevent_core-2.1.so.7() [0x7fd853975000+0xf84e]","#13 /opt/couchbase/bin/../lib/libevent_core-2.1.so.7() [0x7fd853975000+0x18b99]","#14 /opt/couchbase/bin/../lib/libevent_core-2.1.so.7(event_base_loop+0x357) [0x7fd853975000+0x19287]","#15 /opt/couchbase/bin/memcached() [0x400000+0x981a96]","#16 /opt/couchbase/bin/memcached() [0x400000+0x981f9e]","#17 /opt/couchbase/bin/memcached() [0x400000+0x984468]","#18 /opt/couchbase/bin/memcached() [0x400000+0x231a29]","#19 /opt/couchbase/bin/memcached() [0x400000+0x8d156c]","#20 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fd852bb0000+0xe4aa3]","#21 /lib/x86_64-linux-gnu/libpthread.so.0() [0x7fd85330d000+0x7fa3]","#22 /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fd852846000+0xf8eff]"]
|
Rebalance Failures:
u'critical', u'node': u'ns_1@172.23.120.148', u'tstamp': 1705736821580L, u'shortText': u'message', u'serverTime': u'2024-01-19T23:47:01.580Z', u'text': u'Rebalance exited with reason {mover_crashed,\n {unexpected_exit,\n {\'EXIT\',<0.10270.12>,\n {{dcp_wait_for_data_move_failed,"default",\n 943,\'ns_1@172.23.122.135\',\n [\'ns_1@172.23.120.92\',\'ns_1@172.23.120.148\',\n \'ns_1@172.23.106.217\'],\n {error,\n {unexpected_status,\n <<"connection_does_not_exist">>},\n "Error getting dcp stats on \'ns_1@172.23.122.135\' for bucket \\"default\\", partition 943, connection \\"replication:ns_1@172.23.122.135->ns_1@172.23.106.217:default\\": {unexpected_status,\\n <<\\"connection_does_not_exist\\">>}"}},\n [{ns_single_vbucket_mover,\n \'-wait_dcp_data_move/5-fun-0-\',5,\n [{file,"src/ns_single_vbucket_mover.erl"},\n {line,453}]},\n {proc_lib,init_p,3,\n [{file,"proc_lib.erl"},{line,225}]}]}}}}.\nRebalance Operation Id = fb08157923750b87287308717487549f'}
|
2024-01-19 23:47:12,378 | test | ERROR | pool-5-thread-12 | [rest_client:print_UI_logs:2668] {u'code': 0, u'module': u'ns_vbucket_mover', u'type': u'critical', u'node': u'ns_1@172.23.120.148', u'tstamp': 1705736821546L, u'shortText': u'message', u'serverTime': u'2024-01-19T23:47:01.546Z', u'text': u'Worker <0.10234.12> (for action {move,{943,\n [\'ns_1@172.23.122.135\',\n \'ns_1@172.23.120.148\',\n \'ns_1@172.23.120.144\'],\n [\'ns_1@172.23.120.92\',\n \'ns_1@172.23.120.148\',\n \'ns_1@172.23.106.217\'],\n []}}) exited with reason {unexpected_exit,\n {\'EXIT\',\n <0.10270.12>,\n {{dcp_wait_for_data_move_failed,\n "default",\n 943,\n \'ns_1@172.23.122.135\',\n [\'ns_1@172.23.120.92\',\n \'ns_1@172.23.120.148\',\n \'ns_1@172.23.106.217\'],\n {error,\n {unexpected_status,\n <<"connection_does_not_exist">>},\n "Error getting dcp stats on \'ns_1@172.23.122.135\' for bucket \\"default\\", partition 943, connection \\"replication:ns_1@172.23.122.135->ns_1@172.23.106.217:default\\": {unexpected_status,\\n <<\\"connection_does_not_exist\\">>}"}},\n [{ns_single_vbucket_mover,\n \'-wait_dcp_data_move/5-fun-0-\',\n 5,\n [{file,\n "src/ns_single_vbucket_mover.erl"},\n {line,\n 453}]},\n {proc_lib,\n init_p,3,\n [{file,\n "proc_lib.erl"},\n {line,\n 225}]}]}}}'}
|
QE-TEST:
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.47533.ini GROUP=swap_rebalance_P0_set1,rerun=False,disk_optimized_thread_settings=True,get-cbcollect-info=True,autoCompactionDefined=true,upgrade_version=7.6.0-2032,sirius_url=http://172.23.120.103:4000 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_swap_rebalance,nodes_init=5,nodes_swap=2,bucket_spec=magma_dgm.5_percent_dgm.5_node_2_replica_magma_ttl_256,doc_size=256,randomize_value=True,data_load_spec=ttl_load1,data_load_stage=during,skip_validations=True,sleep_before_validation_of_ttl=1300,GROUP=swap_rebalance_P0_set1'
|
Note:
Haven't observed this in last weekly run (Build 7.6.0-2005), Hence marking it as regression.
Attachments
Issue Links
- is duplicated by
-
MB-60473 ThrowExceptionUnderflowPolicy current: 1809408 arg: 1810543 for DCP_BUFFER_ACKNOWLEDGEMENT
- Closed
-
MB-60476 ThrowExceptionUnderflowPolicy current:270108 arg:279424 for DCP_BUFFER_ACKNOWLEDGEMENT
- Closed
-
MB-60524 ThrowExceptionUnderflowPolicy current:28 arg:4750227
- Closed
-
MB-60576 [Upgrade] : Rebalance exited with reason {mover_crashed,{unexpected_exit,{'EXIT',<0.2089.12>,{{dcp_wait_for_data_move_failed,"bucket-0"
- Closed