Details
-
Bug
-
Resolution: Fixed
-
Critical
-
6.6.0
-
Triaged
-
1
-
Unknown
-
KV-Engine Sprint 2020-Dec, KV 2021-Oct-21, KV 2021-Nov
Description
While looking at MB-43176 I found that we were incrementing expiry stats for unsuccessful expirations (i.e. if VBucket::processExpiredItem returned something other than a "success" response). Whilst trying to find a unit test that attempts to process an unsuccessful expiration, I instead found that expirations succeed even if we have an in-flight prepare for the same key. This expiration then generates a new seqno/item for persistence.
On the active node this probably doesn't have much impact unless we're warming up a partial snapshot (in which case we won't warm up the prepare as it has now been "completed"; this would be a data loss scenario anyway, it's just slightly worse now).
On the replica node this appears to cause us to return an error back to the active node when we process the expiration, as we have an in-flight SyncWrite. As far as I can see, the active should tear down the connection at this point. Restreaming the expiration from disk to the replica would "implicitly" complete the prepare. A subsequent commit would then fail with ENOENT, which I think is returned to the active and causes another disconnect. A subsequent disk stream would then work correctly as far as I can tell.
Going to write a cluster test to verify the replica side of things.
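For illustration, here is a minimal sketch of the two behaviours described above: refusing to expire a committed item while a prepare for the same key is in flight, and only bumping the expiry stat when the expiration actually succeeds. Everything in it (ToyVBucket, ExpiryStatus, the two maps, numExpiredItems) is a hypothetical stand-in made up for this sketch. It is not the kv_engine implementation, and the real VBucket::processExpiredItem has a different signature.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified stand-ins; these are NOT the kv_engine types.
enum class ExpiryStatus { Success, NotFound, NotExpired, PreparePending };

struct Item {
    std::uint64_t seqno;
    std::uint32_t expiryTime; // 0 means "never expires"
};

struct ToyVBucket {
    std::map<std::string, Item> committed; // committed items by key
    std::map<std::string, Item> prepared;  // in-flight SyncWrites by key
    std::uint64_t highSeqno = 0;
    std::uint64_t numExpiredItems = 0;     // the "expiry stat"

    // Try to expire the committed item for `key` at time `now`.
    ExpiryStatus processExpiredItem(const std::string& key,
                                    std::uint32_t now) {
        auto it = committed.find(key);
        if (it == committed.end()) {
            return ExpiryStatus::NotFound;
        }
        if (it->second.expiryTime == 0 || it->second.expiryTime > now) {
            return ExpiryStatus::NotExpired;
        }
        // The guard this ticket is about: an in-flight prepare for the
        // same key blocks the expiration, so no new seqno is generated.
        if (prepared.count(key) != 0) {
            return ExpiryStatus::PreparePending;
        }
        // Assumption of this sketch: an expiration is modelled as removing
        // the item and consuming one seqno.
        committed.erase(it);
        ++highSeqno;
        return ExpiryStatus::Success;
    }

    // Caller-side accounting: count only successful expirations.
    void expire(const std::string& key, std::uint32_t now) {
        if (processExpiredItem(key, now) == ExpiryStatus::Success) {
            ++numExpiredItems;
        }
    }
};
```

With a committed item whose expiry time has passed and a prepare outstanding for the same key, calling expire leaves the stat and highSeqno unchanged. Once the prepare is removed from the prepared map, the same call expires the item and increments both.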
Attachments
For Gerrit Dashboard: MB-43238

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
142165,8 | MB-43238: Take a FindUpdateResult in VB::processExpiredItem | master | kv_engine | Status: MERGED | +2 | +1 |
142269,5 | MB-43238: Don't expire committed items if prepare exists | master | kv_engine | Status: MERGED | +2 | +1 |
163884,2 | MB-43238: Fix intermittent failure of expiry pager settings test | master | kv_engine | Status: MERGED | +2 | +1 |
164306,4 | MB-43238: [BP] Take a FindUpdateResult in VB::processExpiredItem | 6.6.3 | kv_engine | Status: MERGED | +2 | +1 |
164307,5 | MB-43238: [BP] Don't expire committed items if prepare exists | 6.6.3 | kv_engine | Status: MERGED | +2 | +1 |
164399,6 | Adding test for doc expired before commit timed out | mad-hatter | TAF | Status: MERGED | +2 | +1 |
164465,3 | Adding test for doc expired before commit timed out | cheshire-cat | TAF | Status: MERGED | +2 | +1 |
164468,3 | Adding test for doc expired before commit timed out | master | TAF | Status: MERGED | +2 | +1 |
164552,2 | MB-43238: Merge branch '6.6.3' into mad-hatter | mad-hatter | kv_engine | Status: MERGED | +2 | +1 |
164554,3 | MB-43238: Merge branch '6.6.3' into mad-hatter | mad-hatter | kv_engine | Status: MERGED | +2 | +1 |
165867,1 | Merge branch 'mad-hatter' into cheshire-cat | cheshire-cat | kv_engine | Status: MERGED | +2 | +1 |
165992,1 | Merge branch 'mad-hatter' into cheshire-cat | cheshire-cat | kv_engine | Status: MERGED | +2 | +1 |
166323,1 | Merge branch 'cheshire-cat' | master | kv_engine | Status: MERGED | +2 | +1 |
166343,1 | Merge branch 'cheshire-cat' | master | kv_engine | Status: MERGED | +2 | +1 |