Details
- Bug
- Resolution: Fixed
- Critical
- Cheshire-Cat
- Centos 7 64 bit; Couchbase Enterprise Build 7.0.0-2840
- Triaged
- Centos 64-bit
- 1
- No
- KV Sprint 2020-Oct
Description
Summary:
Incorrect/inconsistent item count after dropping collections during a failover and rebalance-out operation
Script to Repro:
./testrunner -i /tmp/durability_volume.ini sdk_client_pool=True,rerun=False,skip_validations=False,log_level=debug -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_hard_failover_rebalance_out,nodes_init=5,nodes_failover=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=during,quota_percent=80,GROUP=failover_with_collection_crud
Steps to Reproduce
1. Create 5 node cluster
2020-08-23 03:44:30,214 | test | INFO | pool-1-thread-7 | [table_view:display:72] Rebalance Overview
+----------------+----------+--------------+
| Nodes          | Services | Status       |
+----------------+----------+--------------+
| 172.23.105.211 | kv       | Cluster node |
| 172.23.105.212 | None     | <--- IN ---  |
| 172.23.105.213 | None     | <--- IN ---  |
| 172.23.105.215 | None     | <--- IN ---  |
| 172.23.105.217 | None     | <--- IN ---  |
+----------------+----------+--------------+
2. Create 1 bucket with 61 collections and load data (2500 items per collection; total 61 x 2500 = 152,500)
2020-08-23 03:49:23,144 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
+---------+-----------+----------+------------+-----+--------+-------------+-----------+-----------+
| Bucket  | Type      | Replicas | Durability | TTL | Items  | RAM Quota   | RAM Used  | Disk Used |
+---------+-----------+----------+------------+-----+--------+-------------+-----------+-----------+
| default | couchbase | 3        | none       | 0   | 152500 | 10485760000 | 398529632 | 542554790 |
+---------+-----------+----------+------------+-----+--------+-------------+-----------+-----------+
3. Fail over .215 and .217 one after the other and rebalance them out, while dropping 60 collections in parallel
4. Run the data validation
Expected Results
1 collection remains, with 2500 items.
Actual Results
1 collection remains, with 2512 items (12 additional items). Refer to the screenshots and the observations below.
Observations
Command 1:
/opt/couchbase/bin/cbstats localhost:11210 -u Administrator -p password all -a | grep curr_items | grep vb_active_curr_items
Command 2:
/opt/couchbase/bin/cbstats localhost:11210 -u Administrator -p password collections -a | grep items: | tr -s ' ' | cut -d ':' -f 4 | awk '{sum+=$1} END {print sum}'
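For readers who want to run the same comparison on saved cbstats output, the two pipelines can be approximated in Python. This is a minimal sketch, not part of the original report: the function names and the sample text are hypothetical, and the line layout (`key: value`, with collection stats keyed as `scope:collection:items:`) is an assumption inferred from the `cut -d ':' -f 4` step in Command 2.

```python
import re

def sum_active_items(stats_text: str) -> int:
    """Approximates Command 1: keep lines containing vb_active_curr_items
    and add up the number that follows the first ':' on each line."""
    total = 0
    for line in stats_text.splitlines():
        if "vb_active_curr_items" not in line:
            continue
        _, _, value = line.partition(":")
        value = value.strip()
        if value.isdigit():  # skip lines that do not match the assumed layout
            total += int(value)
    return total

def sum_collection_items(stats_text: str) -> int:
    """Approximates Command 2:
    grep items: | tr -s ' ' | cut -d ':' -f 4 | awk '{sum+=$1} END {print sum}'"""
    total = 0
    for line in stats_text.splitlines():
        if "items:" not in line:                # grep items:
            continue
        squeezed = re.sub(r" +", " ", line)     # tr -s ' '
        fields = squeezed.split(":")            # cut -d ':'
        if len(fields) < 4:
            continue
        tokens = fields[3].split()              # awk's $1 of the 4th field
        if tokens and tokens[0].isdigit():
            total += int(tokens[0])
    return total

# Hypothetical sample text; the per-collection split is invented for illustration.
SAMPLE_ALL = " vb_active_curr_items:                851\n"
SAMPLE_COLLECTIONS = (
    " 0x0:0x0:items:        300\n"
    " 0x0:0x0:mem_used:     12345\n"
    " 0x8:0x9:items:        544\n"
)

print(sum_active_items(SAMPLE_ALL), sum_collection_items(SAMPLE_COLLECTIONS))
```

A non-zero difference between the two sums on a node corresponds to the "items extra" figures reported for that node.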
Command 1 on node .211 - 851 (7 items extra)
Command 2 on node .211 - 844
Command 1 on node .212 - 830 (5 items extra)
Command 2 on node .212 - 825
Command 1 on node .213 - 831
Command 2 on node .213 - 831 (matches on this node)
That is, the sum of items across all collections is 2500 (the expected count), but curr_items_tot reports an incorrect number.
- This issue is reproducible most of the time, but not always.
- The issue was not observed with fewer items.
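The per-node figures above can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the numbers from this report; the variable names are hypothetical, and the second reading for node .212 (825) is taken to be the Command 2 result.

```python
# Per-node counts copied from the observations in this report.
active_curr_items = {".211": 851, ".212": 830, ".213": 831}     # Command 1
collection_item_sums = {".211": 844, ".212": 825, ".213": 831}  # Command 2 (825 assumed to be Command 2)

EXPECTED_ITEMS = 2500  # one remaining collection with 2500 items

# Extra items reported by the bucket-level stat on each node.
per_node_extra = {
    node: active_curr_items[node] - collection_item_sums[node]
    for node in active_curr_items
}

total_active = sum(active_curr_items.values())
total_collections = sum(collection_item_sums.values())

print(per_node_extra)                  # {'.211': 7, '.212': 5, '.213': 0}
print(total_collections)               # 2500
print(total_active)                    # 2512
print(total_active - EXPECTED_ITEMS)   # 12
```

The collection-level sums add up to the expected 2500, while the bucket-level active counts add up to 2512, which accounts for the 12 additional items in the actual results.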
Issue Links
- relates to
  - PYCBC-1046 SDK sending SETs to default collection when not asked too (Open)
  - MB-41253 [Magma] Underflow error in EPVBucket::decrNumTotalItems during EPBucket::compactionCompletionCallback (Closed)
For Gerrit Dashboard: MB-41092

| # | Subject | Branch | Project | Status | CR | V |
|---|---|---|---|---|---|---|
| 140379,5 | MB-41092: Fix not moving purge seqno after dropping documents | master | kv_engine | Status: NEW | -2 | -1 |
| 140805,10 | MB-41092: Fix incorrect docs on disk | master | kv_engine | Status: NEW | 0 | -1 |