Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: 7.6.2
Affects Version/s: 7.2.2
Component/s: fts
Labels:
- candidate-for-trinity

Triage: Untriaged
Story Points: 0
Is this a Regression?: Unknown

Description

A recent change (MB-57334) in the rebalance code path added logic to skip rebalance early when the FTS topology has not changed.

We realised that there are certain situations where we don't want to skip rebalance, even if FTS topology remains unchanged.

Examples:

- Missing partitions due to previous failover operations (addressed in ~~MB-58450~~).
- Server group change for FTS nodes.

To avoid skipping rebalance when the server group changes for one or more FTS nodes, we will have to track NodeDefs across rebalances.
As of now, we only track NodeUUIDs across rebalances.

Keeping track of the NodeUUIDs/NodeDefs as of the last successful rebalance enables us to reconcile the latest NodeDefs against prevNodeDefs (from the last rebalance). Based on that comparison, we can decide whether to skip rebalance or not.
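The comparison between prevNodeDefs and the latest NodeDefs could look roughly like the sketch below. It is only an illustration, not the cbgt implementation: `NodeDef` here is a simplified stand-in with just a UUID and a container (server group) field, and `nodeDefsChanged` and `canSkipRebalance` are hypothetical helper names. It covers only the server group case, not missing partitions.

```go
package main

// NodeDef is a simplified, hypothetical stand-in for a node definition.
// Only the fields needed for this illustration are included.
type NodeDef struct {
	UUID      string
	Container string // server group of the node, for example "group1"
}

// nodeDefsChanged reports whether the set of nodes, or the container of any
// node, differs between the previous snapshot and the current one.
// Both maps are keyed by node UUID.
func nodeDefsChanged(prev, curr map[string]NodeDef) bool {
	if len(prev) != len(curr) {
		return true // a node was added or removed
	}
	for uuid, c := range curr {
		p, ok := prev[uuid]
		if !ok {
			return true // same count, but a different node
		}
		if p.Container != c.Container {
			return true // node moved to another server group
		}
	}
	return false
}

// canSkipRebalance returns true only when nothing tracked in the snapshot
// changed since the last successful rebalance. Other reasons not to skip
// (such as missing partitions) are outside the scope of this sketch.
func canSkipRebalance(prev, curr map[string]NodeDef) bool {
	return !nodeDefsChanged(prev, curr)
}
```

With UUID-only tracking, a node that keeps its UUID but moves to another server group would look unchanged; comparing the container as well is what makes that case visible.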

Previous attempt to solve this: https://review.couchbase.org/c/cbgt/+/197432

Summary of comments on the PR:

On all nodes, we can track prevNodeDefs (a NodeDefs snapshot taken after the last successful rebalance).
After every successful rebalance, we need to update prevNodeDefs on all the nodes. Only the orchestrator node knows that the rebalance completed successfully, so it needs some mechanism to let the other nodes know as well.
- One way to achieve this is to use metakv. After a successful rebalance, the orchestrator can notify the other nodes (via metakv) that they need to update their local copy of prevNodeDefs.
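The notification flow described above can be sketched as follows. This does not use the real metakv API: `sharedStore` is a hypothetical in-memory stand-in for a shared key with change callbacks, `NodeDefs` is reduced to a UUID-to-container map, and `newNode` and `onRebalanceDone` are illustrative names. The point is only the shape of the flow: the orchestrator publishes a snapshot after a successful rebalance, and every node replaces its local prevNodeDefs when notified.

```go
package main

import "sync"

// NodeDefs maps a node UUID to its container (server group).
// Simplified for illustration.
type NodeDefs map[string]string

// copyDefs returns an independent copy so that nodes never share a map.
func copyDefs(in NodeDefs) NodeDefs {
	out := make(NodeDefs, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// sharedStore is a hypothetical stand-in for a metakv-like store that
// invokes registered callbacks whenever a new snapshot is published.
type sharedStore struct {
	mu        sync.Mutex
	listeners []func(NodeDefs)
}

func (s *sharedStore) subscribe(fn func(NodeDefs)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.listeners = append(s.listeners, fn)
}

func (s *sharedStore) publish(snapshot NodeDefs) {
	s.mu.Lock()
	listeners := append([]func(NodeDefs){}, s.listeners...)
	s.mu.Unlock()
	for _, fn := range listeners {
		fn(copyDefs(snapshot))
	}
}

// node holds the local copy of prevNodeDefs on one FTS node.
type node struct {
	mu           sync.Mutex
	prevNodeDefs NodeDefs
}

// newNode registers the node for snapshot notifications.
func newNode(s *sharedStore) *node {
	n := &node{prevNodeDefs: NodeDefs{}}
	s.subscribe(func(defs NodeDefs) {
		n.mu.Lock()
		defer n.mu.Unlock()
		n.prevNodeDefs = defs
	})
	return n
}

// prev returns a copy of the node's current prevNodeDefs.
func (n *node) prev() NodeDefs {
	n.mu.Lock()
	defer n.mu.Unlock()
	return copyDefs(n.prevNodeDefs)
}

// onRebalanceDone is what the orchestrator would call when a rebalance ends.
// A failed rebalance publishes nothing, so all nodes keep the old snapshot.
func onRebalanceDone(s *sharedStore, succeeded bool, current NodeDefs) {
	if !succeeded {
		return
	}
	s.publish(current)
}
```

Publishing only on success keeps prevNodeDefs tied to the last successful rebalance, which is the state the skip decision needs to compare against.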

We also want to verify that the current mechanism to keep track of prevNodeUUIDs is correct.

Issue Links

relates to

MB-61043 Partition layout skew after failover(s) + rebalance; must not skip following rebalance ops in case of a skew

Closed

People

Assignee: Shaad Khan

Reporter: Shaad Khan

Votes: 0

Watchers: 6

Dates

Created: 11/Oct/23 12:12 AM

Updated: 30/Apr/24 6:34 AM

Resolved: 30/Apr/24 6:33 AM

Don't skip rebalance if there is a container change for any fts node.
