Details
- Bug
- Resolution: Fixed
- Critical
- 6.5.0
- Triaged
- Unknown
Description
Currently, on a topology change we reset the high prepared seqno (HPS) of the active node. This happens because of the code here: http://src.couchbase.org/source/xref/trunk/kv_engine/engines/ep/src/durability/active_durability_monitor.cc#691-692. We believe this is also an issue on an active->replica/pending transition, where the new PassiveDM will not have a correct HPS.
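To make the failure mode concrete, here is a minimal, hypothetical model of the bug: a durability monitor whose topology-change path rebuilds its state and drops the HPS, contrasted with a variant that carries the HPS across the change. The struct and method names below are invented for illustration and are not the kv_engine implementation.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, highly simplified stand-in for an ActiveDurabilityMonitor.
// Only the fields needed to show the HPS-reset behaviour are modelled.
struct MiniActiveDM {
    uint64_t highPreparedSeqno = 0;
    std::vector<std::vector<std::string>> topology;

    // Buggy variant: resetting state on a topology change also wipes the
    // HPS, which is what the ticket describes.
    void setReplicationTopologyBuggy(std::vector<std::vector<std::string>> t) {
        topology = std::move(t);
        highPreparedSeqno = 0;  // state rebuild incorrectly zeroes the HPS
    }

    // Fixed variant: the new topology is installed but the existing HPS is
    // deliberately preserved.
    void setReplicationTopologyFixed(std::vector<std::vector<std::string>> t) {
        topology = std::move(t);
        // highPreparedSeqno intentionally left untouched
    }
};
```

With the buggy variant, a monitor that had committed a prepare at seqno 1 reports an HPS of 0 after the second chain is added; with the fixed variant it still reports 1, matching what the reproduction test below expects.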
The test case below is a simple reproduction.
TEST_P(ActiveDurabilityMonitorTest, HPSResetOnTopologyChange) {
    // To start, we have 1 chain with active and replica1
    addSyncWrite(1);
    {
        SCOPED_TRACE("");
        assertNumTrackedAndHPS(1, 1);
    }

    // Should commit on replica1 ack.
    testSeqnoAckReceived(replica1,
                         1 /*ackSeqno*/,
                         0 /*expectedNumTracked*/,
                         1 /*expectedLastWriteSeqno*/,
                         1 /*expectedLastAckSeqno*/);
    {
        SCOPED_TRACE("");
        assertNumTrackedAndHPS(0, 1);
    }

    // Add the secondChain with the new node
    EXPECT_NO_THROW(getActiveDM().setReplicationTopology(
            nlohmann::json::array({{active, replica1}, {active, replica2}})));

    // Still committed
    {
        SCOPED_TRACE("");
        // ************************
        // Test fails because HPS is now 0
        // ************************
        assertNumTrackedAndHPS(0, 1);
    }
}