Details
-
Bug
-
Resolution: Fixed
-
Major
-
6.5.0
-
None
-
Untriaged
-
No
-
KV-Engine MH 2nd Beta
Description
When a store operation requires a bgFetch, the store function runs twice.
E.g.
- store -> ewouldblock
- bg-fetch
- store -> return real status
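The two-pass flow above can be sketched with a toy model. This is not kv_engine code: ToyEngine, addWithRetry, and the Status enum are hypothetical names invented for illustration, and the retry is driven inline here rather than by a real background fetcher notifying the front end.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Toy status codes; the names echo the ticket, not the real kv_engine enum.
enum class Status { Success, KeyExists, WouldBlock };

// Hypothetical stand-in for a full-eviction bucket.
struct ToyEngine {
    std::unordered_map<std::string, std::string> disk;    // persisted items
    std::unordered_map<std::string, std::string> memory;  // resident items
    std::unordered_set<std::string> fetched;  // keys whose on-disk state is known

    // ADD succeeds only if the key does not already exist.
    Status add(const std::string& key, const std::string& value) {
        if (memory.count(key)) {
            return Status::KeyExists;
        }
        if (!fetched.count(key)) {
            // Cache miss: we cannot tell whether the key exists on disk,
            // so the caller has to wait for a bgFetch and try again.
            return Status::WouldBlock;
        }
        memory[key] = value;
        return Status::Success;
    }

    // Simulates bgFetch completion: load the on-disk state into memory.
    void bgFetch(const std::string& key) {
        auto it = disk.find(key);
        if (it != disk.end()) {
            memory[key] = it->second;
        }
        fetched.insert(key);
    }
};

// Runs the store, and re-runs it once if the first pass needed a bgFetch.
// 'passes' reports how many times the store function executed.
Status addWithRetry(ToyEngine& engine, const std::string& key,
                    const std::string& value, int& passes) {
    passes = 1;
    Status status = engine.add(key, value);
    if (status == Status::WouldBlock) {
        engine.bgFetch(key);
        ++passes;
        status = engine.add(key, value);  // second run returns the real status
    }
    return status;
}
```

In this model an ADD for a key that exists only on disk runs the store twice and ends with KeyExists, while an ADD for a key that is already resident runs once.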
Examples of stores behaving like this are:
- Full eviction ADD with a cache miss and a bloom-filter miss (or the bloom filter off): we must bgFetch to be sure the key doesn't exist.
- Full eviction REPLACE with a cache miss and a bloom-filter miss: we must bgFetch to see whether the key exists before processing the replace.
- Full eviction SET with CAS: this only requires a cache miss, as we must bgFetch the item metadata so that the second run of the store can compare the CAS.
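As a rough summary of the three cases above, the decision can be written as a single predicate. This is a sketch under assumptions: StoreOp, StoreContext, and needsBgFetch are hypothetical names, and bloomRulesOutKey encodes one reading of the ticket, namely that a bgFetch is avoidable only when the bloom filter is enabled and reports the key as definitely absent.

```cpp
// Hypothetical operation kinds, named after the ticket's ADD/REPLACE/SET.
enum class StoreOp { Add, Replace, Set };

// Hypothetical summary of what the first store pass knows (full eviction).
struct StoreContext {
    StoreOp op;
    bool cacheHit;          // item (or its metadata) is resident in memory
    bool bloomRulesOutKey;  // assumption: filter enabled AND says "definitely absent"
    bool hasCas;            // request carries a CAS to compare
};

// Returns true if the first store pass must schedule a bgFetch and
// return EWOULDBLOCK instead of a real status.
bool needsBgFetch(const StoreContext& ctx) {
    if (ctx.cacheHit) {
        return false;  // everything needed is already in memory
    }
    switch (ctx.op) {
    case StoreOp::Add:
    case StoreOp::Replace:
        // Existence is unknown unless the bloom filter can rule the key out.
        return !ctx.bloomRulesOutKey;
    case StoreOp::Set:
        // Only a CAS comparison needs the on-disk metadata.
        return ctx.hasCas;
    }
    return false;
}
```

For instance, an ADD with a cache miss and the bloom filter off gives needsBgFetch(...) == true, while a plain SET without CAS never needs a fetch in this model.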
With the durability changes, any of these ADD/REPLACE/SET operations could be a durable operation, and some new code now short-cuts the second run of the store, incorrectly returning success.
The very first store call, which returns EWOULDBLOCK, executes the tail of EventuallyPersistentEngine::storeIfInner. When the item is pending (a durable set), that EWOULDBLOCK is interpreted incorrectly: the function assumes the durable operation has been accepted and is pending async completion. In fact the operation has not been accepted; the EWOULDBLOCK was returned because a bg-fetch was scheduled.
- engine-specific is prematurely set up for the sync-write completion: https://github.com/couchbase/kv_engine/blob/69044aeef5ef670fee9a3b74d739fd03e304990e/engines/ep/src/ep_engine.cc#L2484
Next, when the bg-fetch completes, the store runs again. This time we short-cut the operation and incorrectly return success (and a bogus CAS).
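The failure sequence can be reproduced in a small model. Everything here is hypothetical (Cookie, ToyVBucket, StoreResult, and both store functions); it is not the code in ep_engine.cc. The guarded variant shows only one possible way to tell the two meanings of EWOULDBLOCK apart and is not necessarily the fix that was applied.

```cpp
enum class Status { Success, WouldBlock };

// Hypothetical per-request state; engineSpecific marks "sync write pending".
struct Cookie {
    bool engineSpecific = false;
};

// Hypothetical result that records WHY the store would block.
struct StoreResult {
    Status status;
    bool bgFetchScheduled;
};

struct ToyVBucket {
    bool metadataResident = false;
    bool syncWriteAccepted = false;

    StoreResult store() {
        if (!metadataResident) {
            return {Status::WouldBlock, true};  // blocked on a bgFetch
        }
        syncWriteAccepted = true;
        return {Status::WouldBlock, false};  // accepted, awaiting durability
    }
    void completeBgFetch() { metadataResident = true; }
};

// Buggy tail: any WouldBlock on a durable store is treated as "accepted".
Status buggyDurableStore(Cookie& cookie, ToyVBucket& vb) {
    if (cookie.engineSpecific) {
        cookie.engineSpecific = false;
        return Status::Success;  // short-cut: assumes the sync write completed
    }
    StoreResult result = vb.store();
    if (result.status == Status::WouldBlock) {
        cookie.engineSpecific = true;  // set prematurely if a bgFetch caused it
    }
    return result.status;
}

// Guarded tail (illustrative only): mark the cookie only when the
// WouldBlock really means the sync write was accepted.
Status guardedDurableStore(Cookie& cookie, ToyVBucket& vb) {
    if (cookie.engineSpecific) {
        cookie.engineSpecific = false;
        return Status::Success;
    }
    StoreResult result = vb.store();
    if (result.status == Status::WouldBlock && !result.bgFetchScheduled) {
        cookie.engineSpecific = true;
    }
    return result.status;
}
```

With the buggy tail, the second call after completeBgFetch() returns Success even though syncWriteAccepted is still false. With the guarded tail, the second call actually performs the store, and Success is only returned on the call that follows the sync-write completion.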