Details
- Bug
- Resolution: Duplicate
- Critical
- Cheshire-Cat
- Couchbase EE 7.0.0-4437
- Triaged
- Centos 64-bit
- 1
- No
- KV-Engine 2021-March
Description
I was trying to reproduce MB-44079.
Script to repro
Cherry-pick http://review.couchbase.org/c/TAF/+/145788 on top of TAF and run:
guides/gradlew testrunner -P jython=/opt/jython/bin/jython -P args="-i /tmp/durability_volume.ini -p rerun=False,get-cbcollect-info=True,collect_pcaps=True -t bucket_collections.collections_rebalance.CollectionsRebalance.test_temp_MB,nodes_init=3,nodes_in=2,override_spec_params=durability;replicas,durability=MAJORITY,replicas=2,bucket_spec=multi_bucket.buckets_all_ephemeral -m rest"
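The `-t` argument in the command above packs the test path and its parameters into one comma-separated string. As a reading aid only, here is a minimal sketch that splits that string into its parts. The helper name `parse_test_spec` is hypothetical, and this is not a description of how TAF itself parses the argument.

```python
def parse_test_spec(spec):
    """Split 'test.path,key=value,...' into (test_path, params).

    Hypothetical helper for reading the command line; not TAF code.
    """
    test_path, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        # Everything before the first '=' is the key, the rest is the value.
        key, _, value = pair.partition("=")
        params[key] = value
    return test_path, params


# The -t value copied from the command above.
spec = (
    "bucket_collections.collections_rebalance.CollectionsRebalance.test_temp_MB,"
    "nodes_init=3,nodes_in=2,override_spec_params=durability;replicas,"
    "durability=MAJORITY,replicas=2,bucket_spec=multi_bucket.buckets_all_ephemeral"
)
test_path, params = parse_test_spec(spec)

# override_spec_params is itself a ';'-separated list of parameter names.
overrides = params["override_spec_params"].split(";")
```

Read this way, the run starts with 3 nodes, moves 2 nodes per rebalance, and overrides the durability and replicas values of the all-ephemeral bucket spec with MAJORITY and 2.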
Steps to Reproduce
1. Create a 3 node cluster
+----------------+----------+-----------------------+---------------+--------------+
| Nodes          | Services | Version               | CPU           | Status       |
+----------------+----------+-----------------------+---------------+--------------+
| 172.23.105.215 | kv       | 7.0.0-4437-enterprise | 1.64099974754 | Cluster node |
| 172.23.105.217 | None     |                       |               | <--- IN ---  |
| 172.23.105.219 | None     |                       |               | <--- IN ---  |
+----------------+----------+-----------------------+---------------+--------------+
2. Create 3 ephemeral buckets
+---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
| Bucket  | Type      | Replicas | Durability | TTL | Items  | RAM Quota  | RAM Used  | Disk Used |
+---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
| bucket1 | ephemeral | 2        | none       | 0   | 20050  | 314572800  | 89136808  | 102       |
| bucket2 | ephemeral | 2        | none       | 0   | 15075  | 314572800  | 78653032  | 102       |
| default | ephemeral | 2        | none       | 0   | 500000 | 4718592000 | 754276120 | 102       |
+---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
3. Change rebalanceMovesPerNode to 1
2021-02-11 05:37:59,687 | test | INFO | MainThread | [cluster_ready_functions:set_rebalance_moves_per_nodes:119] Changed Rebalance settings: {u'rebalanceMovesPerNode': 1}
4. Rebalance-in 2 nodes with durability MAJORITY level
5. Rebalance-out 2 nodes with durability MAJORITY level
Repeat steps 4 and 5 in cycles.
The rebalance-in failed during the second cycle:
2021-02-11 05:48:14,161 | test | INFO | pool-1-thread-5 | [table_view:display:72] Rebalance Overview
+----------------+----------+-----------------------+---------------+--------------+
| Nodes          | Services | Version               | CPU           | Status       |
+----------------+----------+-----------------------+---------------+--------------+
| 172.23.105.215 | kv       | 7.0.0-4437-enterprise | 23.736207339  | Cluster node |
| 172.23.105.217 | kv       | 7.0.0-4437-enterprise | 20.6993363961 | Cluster node |
| 172.23.105.219 | kv       | 7.0.0-4437-enterprise | 20.7320194523 | Cluster node |
| 172.23.105.220 | None     |                       |               | <--- IN ---  |
| 172.23.106.237 | None     |                       |               | <--- IN ---  |
+----------------+----------+-----------------------+---------------+--------------+
It looks like the connection to memcached on 172.23.106.237 was lost during the above rebalance; the cause is not yet known.
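For anyone reproducing this by hand rather than through TAF, step 3 can be sketched as a REST call. The example below only builds the request and does not send it. It assumes the `rebalanceMovesPerNode` setting is changed with a POST to `/settings/rebalance` on the cluster manager; the port 8091, the credentials, and the helper name `build_moves_per_node_request` are placeholders, not values taken from this ticket.

```python
import base64
import urllib.parse
import urllib.request


def build_moves_per_node_request(host, moves_per_node, user, password, port=8091):
    """Build (but do not send) a POST that sets rebalanceMovesPerNode.

    Hypothetical helper; host, port, and credentials are assumptions.
    """
    body = urllib.parse.urlencode(
        {"rebalanceMovesPerNode": moves_per_node}
    ).encode("ascii")
    request = urllib.request.Request(
        "http://{}:{}/settings/rebalance".format(host, port),
        data=body,
        method="POST",
    )
    # HTTP basic auth header built from the placeholder credentials.
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder credentials; the node address is the first node from step 1.
request = build_moves_per_node_request(
    "172.23.105.215", 1, "Administrator", "password"
)
```

Passing `request` to `urllib.request.urlopen` would send it; that call is left out here so the sketch has no side effects on a live cluster.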
Issue Links
- duplicates
  - MB-44079 Ephemeral out of order purging can cause prepares to be recommitted and DurabilityMonitor monotonicity exceptions to throw (Closed)
For Gerrit Dashboard: MB-44255

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
145788,14 | CBQE-6496/MB-44255/MB-44079: Coverage for ephemeral | master | TAF | MERGED | +2 | +1 |
145948,3 | MB-44255: Add some extra logging on ADM->PDM | master | kv_engine | MERGED | +2 | +1 |
146431,2 | MB-44255: Sanity check for eph PDM snap end | master | kv_engine | ABANDONED | 0 | +1 |
146462,3 | MB-44255: Add _vbucket-details so that we can dump seqlist | master | kv_engine | MERGED | +2 | +1 |
147688,5 | Reduce ADM and PDM debug logging | master | kv_engine | MERGED | +2 | +1 |