Details
- Bug
- Resolution: Duplicate
- Major
- None
- Cheshire-Cat
- 7.0.0-4346-enterprise
- Triaged
- Centos 64-bit
- 1
- Yes
Description
Script to Repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/win10-bucket-ops.ini rerun=False,quota_percent=95,crash_warning=True,GROUP=failover_with_collection_crud_durability_MAJORITY -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_graceful_failover_rebalance_out,nodes_init=5,nodes_failover=1,override_spec_params=durability;replicas,durability=MAJORITY,replicas=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=during,quota_percent=80,GROUP=failover_with_collection_crud_durability_MAJORITY'
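To make the test parameters in the command above easier to read, the value passed to `-t` can be split into the test name and its key/value pairs. This is only a minimal sketch: `parse_test_spec` is a hypothetical helper written for this ticket, not the test runner's own parser, and it assumes parameters are comma-separated `key=value` pairs as they appear in the command.

```python
# Hypothetical helper (not part of the test framework): splits the string
# given to -t into the test name and a dict of its parameters.
TEST_SPEC = (
    "bucket_collections.collections_rebalance.CollectionsRebalance."
    "test_data_load_collections_with_graceful_failover_rebalance_out,"
    "nodes_init=5,nodes_failover=1,override_spec_params=durability;replicas,"
    "durability=MAJORITY,replicas=2,"
    "bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,"
    "data_load_spec=volume_test_load_with_CRUD_on_collections,"
    "data_load_stage=during,quota_percent=80,"
    "GROUP=failover_with_collection_crud_durability_MAJORITY"
)


def parse_test_spec(spec):
    """Return (test_name, params) for a 'name,k=v,k=v' string."""
    test_name, _, rest = spec.partition(",")
    params = {}
    for pair in rest.split(","):
        if not pair:
            continue
        # Split on the first '=' only, so values are kept verbatim.
        key, _, value = pair.partition("=")
        params[key] = value
    return test_name, params


if __name__ == "__main__":
    name, params = parse_test_spec(TEST_SPEC)
    print(name)
    for key in sorted(params):
        print("  %s = %s" % (key, params[key]))
```

All values are kept as strings, including `override_spec_params=durability;replicas`, whose `;`-separated list is left unsplit.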
Steps to Repro
1) Create a 5 node cluster
2021-01-31 08:29:33,272 | test | INFO | pool-1-thread-6 | [table_view:display:72] Rebalance Overview
------------------------------------
Nodes | Services | Status |
------------------------------------
172.23.98.196 | kv | Cluster node |
172.23.98.195 | None | <--- IN — |
172.23.121.10 | None | <--- IN — |
172.23.104.186 | None | <--- IN — |
172.23.120.206 | None | <--- IN — |
------------------------------------
2) Create buckets/scopes/collections.
--------------------------------------------------------------------------
Bucket | Type | Replicas | Durability | TTL | Items | RAM Quota | RAM Used | Disk Used |
--------------------------------------------------------------------------
bucket1 | couchbase | 2 | none | 0 | 3000 | 1048576000 | 15118121 | 357353918 |
bucket2 | ephemeral | 2 | none | 0 | 3000 | 1048576000 | 198787274 | 170 |
default | couchbase | 2 | none | 0 | 500000 | 10485760000 | 339767192 | 602148970 |
--------------------------------------------------------------------------
3) Start collection CRUD + durability data load
4) Start graceful failover. It fails as shown below.
2021-01-31 08:39:26,507 | test | ERROR | pool-1-thread-16 | [rest_client:print_UI_logs:2595] {u'code': 0, u'module': u'ns_orchestrator', u'type': u'critical', u'node': u'ns_1@172.23.98.196', u'tstamp': 1612111164914L, u'shortText': u'message', u'serverTime': u'2021-01-31T08:39:24.914Z', u'text': u'Graceful failover exited with reason {mover_crashed,\n {{{{badmatch,{error,timeout}},\n [{mc_client_binary,stats_recv,4,\n [{file,"src/mc_client_binary.erl"},\n {line,171}]},\n {mc_client_binary,stats,4,\n [{file,"src/mc_client_binary.erl"},\n {line,482}]},\n {ns_memcached,do_handle_call,3,\n [{file,"src/ns_memcached.erl"},\n {line,453}]},\n {ns_memcached,worker_loop,3,\n [{file,"src/ns_memcached.erl"},\n {line,224}]},\n {proc_lib,init_p_do_apply,3,\n [{file,"proc_lib.erl"},\n {line,249}]}]},\n {gen_server,call,\n [\'ns_memcached-default\',\n {get_dcp_docs_estimate,285,\n "replication:ns_1@172.23.120.206->ns_1@172.23.121.10:default"},\n 180000]}},\n {gen_server,call,\n [{\'janitor_agent-default\',\n \'ns_1@172.23.120.206\'},\n {if_rebalance,<0.13184.2>,\n {get_vbucket_high_seqno,385}},\n infinity]}}}.\nRebalance Operation Id = 765e6b6d308dcd677bb6c4ef88b8b2c3'}
2021-01-31 08:39:26,509 | test | ERROR | pool-1-thread-16 | [rest_client:print_UI_logs:2595] {u'code': 0, u'module': u'ns_vbucket_mover', u'type': u'critical', u'node': u'ns_1@172.23.98.196', u'tstamp': 1612111164805L, u'shortText': u'message', u'serverTime': u'2021-01-31T08:39:24.805Z', u'text': u'Worker <0.31602.2> (for action {move,{385,\n [\'ns_1@172.23.120.206\',\n \'ns_1@172.23.98.196\',\n \'ns_1@172.23.121.10\'],\n [\'ns_1@172.23.98.196\',\n \'ns_1@172.23.121.10\',\n \'ns_1@172.23.120.206\'],\n []}}) exited with reason {{{{badmatch,\n {error,\n timeout}},\n [{mc_client_binary,\n stats_recv,\n 4,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 171}]},\n {mc_client_binary,\n stats,4,\n [{file,\n "src/mc_client_binary.erl"},\n {line,\n 482}]},\n {ns_memcached,\n do_handle_call,\n 3,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 453}]},\n {ns_memcached,\n worker_loop,\n 3,\n [{file,\n "src/ns_memcached.erl"},\n {line,\n 224}]},\n {proc_lib,\n init_p_do_apply,\n 3,\n [{file,\n "proc_lib.erl"},\n {line,\n 249}]}]},\n {gen_server,\n call,\n [\'ns_memcached-default\',\n {get_dcp_docs_estimate,\n 285,\n "replication:ns_1@172.23.120.206->ns_1@172.23.121.10:default"},\n 180000]}},\n {gen_server,\n call,\n [{\'janitor_agent-default\',\n \'ns_1@172.23.120.206\'},\n {if_rebalance,\n <0.13184.2>,\n {get_vbucket_high_seqno,\n 385}},\n infinity]}}'}
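Both log entries above carry the same crash reason, wrapped differently. A small sketch like the following can pull out the parts that matter for triage: the error, the stats call that hit it, and the vBucket being moved. `summarize_crash` and its key names are hypothetical and written for this ticket only; the sample is an abbreviated excerpt of the first entry with the stack frames omitted, and `call_timeout_ms` is read as the timeout argument of the `gen_server:call`.

```python
import re

# Abbreviated excerpt of the first UI log entry (stack frames omitted).
SAMPLE = (
    "Graceful failover exited with reason {mover_crashed,\n"
    " {{{{badmatch,{error,timeout}},\n"
    " {gen_server,call,\n"
    " ['ns_memcached-default',\n"
    " {get_dcp_docs_estimate,285,\n"
    " \"replication:ns_1@172.23.120.206->ns_1@172.23.121.10:default\"},\n"
    " 180000]}},\n"
    " {gen_server,call,\n"
    " [{'janitor_agent-default',\n"
    " 'ns_1@172.23.120.206'},\n"
    " {if_rebalance,<0.13184.2>,\n"
    " {get_vbucket_high_seqno,385}},\n"
    " infinity]}}}."
)


def summarize_crash(text):
    """Extract the timed-out call from a crash reason string."""
    # The Erlang term is wrapped across lines at arbitrary points, so
    # drop all whitespace before matching.
    flat = re.sub(r"\s+", "", text)
    summary = {}

    m = re.search(r"\{badmatch,\{error,(\w+)\}\}", flat)
    if m:
        summary["error"] = m.group(1)

    m = re.search(
        r'\{get_dcp_docs_estimate,(\d+),"([^"]+)"\},(\d+)\]', flat)
    if m:
        summary["stats_vbucket"] = int(m.group(1))
        summary["stream"] = m.group(2)
        summary["call_timeout_ms"] = int(m.group(3))

    m = re.search(r"\{get_vbucket_high_seqno,(\d+)\}", flat)
    if m:
        summary["moving_vbucket"] = int(m.group(1))

    return summary


if __name__ == "__main__":
    for key, value in sorted(summarize_crash(SAMPLE).items()):
        print("%s: %s" % (key, value))
```

Because whitespace is stripped first, the same function works on the more heavily wrapped `ns_vbucket_mover` entry once its `text` field has been unescaped.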
cbcollect_info attached.
I think this is a regression, as it was not seen on 7.0.0-4325.
Issue Links
- duplicates MB-44021 [Collections] - AddressSanitizer: seen during graceful failover + full recovery (Closed)