Couchbase Server / MB-52490

[30TB, 1% KV DGM, CBAS]: Rebalance-in of 1 KV node has been stuck for 35 hours. No movement in data/vBuckets.


Details

    • Type: Bug
    • Status: Reopened
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version: 7.1.1
    • Fix Version: Morpheus
    • Component: couchbase-bucket
    • Environment: Enterprise Edition 7.1.1 build 3067

    Description

      1. Create a 3-node KV cluster.
      2. Create a magma bucket with 1 replica and RAM quota = 200GB.
      3. Load 10B 1024-byte documents. This is 20TB of active + replica data and puts the bucket at 1% DGM (a sizing sketch follows these steps).
      4. Upsert the whole data set to create 50% fragmentation.
      5. Create 25 datasets on cbas ingesting data from different collections and let the ingestion start. Start a SQL++ load at 10 QPS asynchronously.
      6. Start an async CRUD data load:

        Read Start: 0
        Read End: 100000000
        Update Start: 0
        Update End: 100000000
        Expiry Start: 0
        Expiry End: 0
        Delete Start: 100000000
        Delete End: 200000000
        Create Start: 200000000
        Create End: 300000000
        Final Start: 200000000
        Final End: 300000000
        

      7. Rebalance in 1 KV node. The rebalance appears to have been stuck for hours...
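
      A rough sizing sketch behind steps 2-3 (assuming the 200GB figure is the bucket's total RAM quota, which is consistent with ramQuota=68267 MB x 3 nodes in the QE command below):

        # Back-of-envelope numbers behind "20TB of active + replica" and "1% DGM".
        # Treating RAM=200GB as the total bucket quota is an assumption here.
        num_docs = 10_000_000_000            # 10B documents
        doc_size = 1024                      # bytes per document
        replicas = 1

        active_bytes = num_docs * doc_size
        total_bytes = active_bytes * (1 + replicas)
        ram_quota_bytes = 200 * 1024**3      # ~200GB bucket RAM quota

        print(f"active data    : {active_bytes / 1e12:.1f} TB")         # ~10.2 TB
        print(f"active+replica : {total_bytes / 1e12:.1f} TB")          # ~20.5 TB
        print(f"resident (DGM) : {ram_quota_bytes / total_bytes:.1%}")  # ~1.0%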

      QE Test

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job3.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.Hospital.Murphy.ClusterOpsVolume,nodes_init=3,graceful=True,skip_cleanup=True,num_items=100000000,num_buckets=1,bucket_names=GleamBook,doc_size=1300,bucket_type=membase,eviction_policy=fullEviction,iterations=2,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,num_collections=50,maxttl=10,num_indexes=25,pc=10,index_nodes=0,cbas_nodes=1,fts_nodes=0,ops_rate=200000,ramQuota=68267,doc_ops=create:update:delete:read,mutation_perc=100,rebl_ops_rate=50000,key_type=RandomKey -m rest'
      

      Attachments

        Issue Links


          Activity

            Paolo Cocchi added a comment (edited)

            We seem to be hitting MB-44562 again here.

            We hit the max number of backfills that can run on a node:

             ep_dcp_max_running_backfills:                                                                                           4096
             ep_dcp_num_running_backfills:                                                                                           4096
            

            Replication streams 1010/1011 stay in the pending queue:

             eq_dcpq:replication:ns_1@172.23.110.67->ns_1@172.23.110.70:GleamBookUsers0:backfill_num_pending:                        2
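
             A minimal sketch of the saturation condition implied by these counters (illustrative "key: value" parsing of a stats dump, not a formal cbstats parser):

              # When running backfills reach the per-node limit, further stream
              # backfills are queued and show up as backfill_num_pending > 0.
              stats = {}
              with open("stats.log") as f:
                  for line in f:
                      key, sep, value = line.rpartition(":")
                      if sep and value.strip().isdigit():
                          stats[key.strip()] = int(value.strip())

              running = stats.get("ep_dcp_num_running_backfills", 0)
              limit = stats.get("ep_dcp_max_running_backfills", 0)
              if limit and running >= limit:
                  print(f"backfill limit saturated: {running}/{limit}")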
            

            We are close to 10k outbound streams on the node:

            cbcollect_info_ns_1@172.23.110.67_20220609-182552 % grep -E "stream_.*opaque" stats.log | wc -l
                9623
            

            Most of them are cbas streams:

            cbcollect_info_ns_1@172.23.110.67_20220609-182552 % grep -E "stream_.*opaque" stats.log | grep "replication" | wc -l
                 673
            cbcollect_info_ns_1@172.23.110.67_20220609-182552 % grep -E "stream_.*opaque" stats.log | grep "cbas" | wc -l
                8950
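
             A small sketch generalizing the greps above: counting the per-vbucket stream entries in stats.log by DCP connection type (the substring matching simply mirrors the greps; the exact stat-key format varies by producer name):

              import re
              from collections import Counter

              counts = Counter()
              with open("stats.log") as f:
                  for line in f:
                      if not re.search(r"stream_.*opaque", line):
                          continue
                      for conn_type in ("replication", "cbas", "fts"):
                          if conn_type in line:
                              counts[conn_type] += 1
                              break
                      else:
                          counts["other"] += 1

              for conn_type, n in counts.most_common():
                  print(conn_type, n)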
            

            Hey Ritesh Agarwal, could you ask the cbas team to have a look and check whether the high number of open streams is normal/expected behaviour, please?
            We had a similar problem in MB-44562 with FTS. At some point FTS had introduced a bug where stale streams were left open.

            Meanwhile, I'm reviewing some possible improvements in the way KV handles this kind of scenario.

            Update
            Ritesh Agarwal There's also an ongoing discussion in MB-51950 where the same CBAS behaviour (i.e., creating one stream per collection) has pushed a single node to creating ~125k streams. CBAS seems to have set the fix for that to Morpheus (MB-45591).


            Ritesh Agarwal added a comment

            Update from Michael Blow: it is normal / expected in 7.1.1 to have one vbucket stream per mapped collection.
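
            For context, a rough back-of-envelope that is consistent with the ~8,950 cbas streams seen on one node, assuming (hypothetically) the default 1,024 vbuckets spread roughly evenly across the 3 KV nodes and one stream per (active vbucket, mapped collection), with the 25 datasets from the description each mapped to a different collection:

              # Hypothetical check: default 1024 vbuckets, 3 KV nodes, 25 mapped
              # collections, one DCP stream per (active vbucket, collection).
              vbuckets_per_node = 1024 // 3        # ~341 active vbuckets per node
              mapped_collections = 25
              print(vbuckets_per_node * mapped_collections)  # 8525, same order as the 8950 observed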

            Paolo Cocchi added a comment (edited)

            Hi Ritesh Agarwal,
            we have a possible KV improvement for this up for review on Gerrit, and it would be good to verify it against the real scenario before it is merged.
            Could you please repeat this test on the toy build at http://latestbuilds.service.couchbase.com/builds/latestbuilds/couchbase-server/toybuilds/202206290/ ?
            Thanks


            Ritesh Agarwal added a comment

            Sure Paolo Cocchi, started the run.

            Ritesh Agarwal added a comment (edited)

            Hi Paolo Cocchi, I ran the test on the toy build you shared and there was no progress in the rebalance during the initial >8 hours or so, so the test failed. After checking the cluster, I saw that the rebalance did eventually finish in 16 hours of total time, although there were no data mutations on the cluster during the last 8 hours because the test had already finished.

            Here are the logs for checking why there was zero progress during the initial 8 hours of the rebalance:
            http://supportal.couchbase.com/snapshot/27e6fb34222441ce2447c176748afb16::0
            s3://cb-customers-secure/rebalance/2022-07-07/collectinfo-2022-07-07t184505-ns_1@172.23.110.64.zip
            s3://cb-customers-secure/rebalance/2022-07-07/collectinfo-2022-07-07t184505-ns_1@172.23.110.65.zip
            s3://cb-customers-secure/rebalance/2022-07-07/collectinfo-2022-07-07t184505-ns_1@172.23.110.66.zip
            s3://cb-customers-secure/rebalance/2022-07-07/collectinfo-2022-07-07t184505-ns_1@172.23.110.67.zip
            s3://cb-customers-secure/rebalance/2022-07-07/collectinfo-2022-07-07t184505-ns_1@172.23.110.68.zip


            People

              Assignee: Daniel Owen
              Reporter: Ritesh Agarwal
              Votes: 0
              Watchers: 9


                Gerrit Reviews

                  There are 5 open Gerrit changes
