Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: 7.1.0
Affects Version/s: 4.5.0, 4.5.1
Component/s: couchbase-bucket
Labels:

Triage: Triaged
Epic Link: KV: Robust Rebalance
Is this a Regression?: No

Description

On the replica side we accept items from a DCP stream only if memory usage is below replication_throttle_threshold (99%).
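As a rough illustration of the check described above, here is a minimal sketch. The function name and parameters are hypothetical, and it assumes the threshold is a percentage of the bucket memory quota; it is not the actual ep-engine code.

```python
def replica_accepts_item(mem_used: int, quota: int,
                         replication_throttle_threshold: int = 99) -> bool:
    """Return True if the replica side would take an item from the DCP stream.

    Assumption: the threshold is a percentage of the bucket memory quota.
    Integer arithmetic is used so there is no float rounding at the boundary.
    """
    return mem_used * 100 < replication_throttle_threshold * quota
```

With a quota of 1000 bytes, for example, a replica using 980 bytes would still accept items, while one using 995 bytes would refuse them.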

On a 2-node cluster with 1 replica, we can run into a situation where items are held in memory on the active side of each DCP stream, ready to be sent to the replica side, but the replica side refuses to take in any items because it has reached replication_throttle_threshold. (Note that memory usage reaches replication_throttle_threshold because of items in the DCP readyQ that are waiting to be sent to the other side. The resident ratio is near 0%, i.e. all items are paged out.) This can lead to an operational deadlock when active and replica are present on both nodes (as they are in our case).
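The scenario above can be reproduced in a toy model. This is only a sketch under simplifying assumptions: each node's memory usage consists entirely of its outgoing readyQ (resident ratio near 0%), received items are persisted and ejected immediately, and all names (`Node`, `step`, `makes_progress`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    quota: int                    # bucket memory quota in bytes
    ready_q: int                  # bytes queued in the readyQ, waiting to be sent
    threshold_percent: int = 99   # replication_throttle_threshold

    def mem_used(self) -> int:
        # Assumption: the readyQ is the only significant memory consumer.
        return self.ready_q

    def accepts_replica_items(self) -> bool:
        return self.mem_used() * 100 < self.threshold_percent * self.quota


def step(sender: Node, receiver: Node, item_size: int) -> bool:
    """Try to send one item from sender's readyQ; return True on progress."""
    if sender.ready_q < item_size or not receiver.accepts_replica_items():
        return False
    # Sending frees memory on the sender. The receiver is assumed to persist
    # and eject the item, so its memory usage is left unchanged here.
    sender.ready_q -= item_size
    return True


def makes_progress(a: Node, b: Node, item_size: int = 1) -> bool:
    """True if either direction of replication can move an item."""
    return step(a, b, item_size) or step(b, a, item_size)
```

When both nodes sit above the threshold purely because of their own readyQ, neither can send (the peer refuses) and neither can free memory (that requires sending), so `makes_progress` stays False indefinitely. If either node is below the threshold, items start flowing again.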

Cursor dropping, implemented in ~~MB-9897~~, handles the deadlock only when the memory usage is due to items waiting to be sent that sit in the checkpoint. Although it reduces the scope of the deadlock, it does not completely solve the problem: the same deadlock can occur when items sitting in the readyQ of the active stream hog the memory.
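To make the remaining gap concrete, the sketch below shows one generic way to keep a producer-side queue from growing without bound: cap it by bytes and have the caller pause when the cap is hit. This is an illustrative assumption, not the change that resolved this issue; the class and method names are hypothetical.

```python
from collections import deque


class BoundedReadyQueue:
    """A byte-capped FIFO; a stand-in for a stream's readyQ (hypothetical)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.bytes_used = 0
        self._items = deque()

    def try_push(self, item: bytes) -> bool:
        """Queue the item if it fits; otherwise return False so the caller
        can pause producing (for example, stop a backfill scan) and retry."""
        if self.bytes_used + len(item) > self.max_bytes:
            return False
        self._items.append(item)
        self.bytes_used += len(item)
        return True

    def pop(self) -> bytes:
        """Remove the oldest item, freeing its bytes. Raises IndexError if empty."""
        item = self._items.popleft()
        self.bytes_used -= len(item)
        return item
```

The point of the cap is that memory held by items waiting to be sent stays below a known bound, so it cannot by itself push the node over replication_throttle_threshold.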

Attachments

Issue Links

blocks: MB-43801 Rebalance In is stuck at 94% while bucket is in ~1% DGM. (Closed)

is duplicated by: MB-43761 Magma: Rebalance stuck at 4.60784313725% (Closed)

People

Assignee: Daniel Owen

Reporter: Manu Dhundi (Inactive)

Votes: 0

Watchers: 7

Dates

Created: 08/Jun/16 6:22 PM

Updated: 12/Jan/22 3:34 AM

Resolved: 12/Jan/22 3:34 AM

Gerrit Reviews

There are no open Gerrit changes.

There are 2 closed Gerrit changes:

- MB-19889: Test BackfillManager::scanBuffer limit on readyQ for Producer
- MB-19889: Test BackfillManager::buffer limit on readyQ for Producer

Potential operational deadlock (livelock) during heavy load
