Details
- Bug
- Resolution: Fixed
- Major
- 6.0.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.5.1, 6.6.0, 6.6.1, 6.6.2, 6.5.2, 6.5.0, 7.6.0, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.1.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.1.3, 7.2.1, 7.1.5, 7.2.4, 7.2.2, 7.1.6, 7.2.3, 7.2.5, 7.6.1
Description
Warmup uses a pause/resume pattern; while paused, it holds no reference to the vbucket object it is loading data into. This issue tracks the lack of protection against that vbucket object being replaced while Warmup is paused.
E.g.
- T1: Warmup creates vb:0
- T2: Warmup reaches the KeyDump phase; certain operations are now unblocked, e.g. resetVB/setVBState
- T3: Warmup obtains vb:0, loads keys, and pauses -> vb:0 released
- T4: vb:0 is replaced, e.g. set(dead) followed by set(replica/active/pending)
- T5: Warmup resumes, obtains vb:0, and loads keys into the new vb:0
The final state is that vb:0 now contains a mix of data from warmup and, possibly later, from DCP.
Note that there is no evidence that step T4 fully occurs: we do see cases where a vbucket is deleted during warmup, but we have not seen it recreated.
The correct solution here is for Warmup to use a weak_ptr<VBucket> rather than asking for a shared_ptr<VBucket> as it progresses through the phases. When Warmup fails to lock the weak_ptr it can now ignore that vbucket.
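The T1..T5 sequence and the weak_ptr approach can be illustrated with a minimal, self-contained sketch. All of the types and functions below (`VBucketMap`, `WarmupVBState`, `loadBatchById`, `replayById`, `replayWeak`) are hypothetical toy stand-ins invented for illustration, not the actual kv_engine classes or the change that was merged.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins (assumptions); NOT the real kv_engine types.
struct VBucket {
    std::vector<std::string> keys;
};

struct VBucketMap {
    std::map<int, std::shared_ptr<VBucket>> vbs;

    std::shared_ptr<VBucket> get(int vbid) const {
        auto it = vbs.find(vbid);
        if (it == vbs.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Models T1 (create) and T4 (set(dead) then set(replica/active/pending)):
    // a fresh object is installed under the same id.
    void replace(int vbid) {
        vbs[vbid] = std::make_shared<VBucket>();
    }
};

// Hazard: every resume looks the vbucket up by id, so a replacement is
// indistinguishable from the object warmup started loading into.
bool loadBatchById(const VBucketMap& map, int vbid, const std::string& key) {
    auto vb = map.get(vbid);
    if (!vb) {
        return false;
    }
    vb->keys.push_back(key);
    return true;
}

// Proposed shape: warmup remembers the specific object through a weak_ptr.
struct WarmupVBState {
    std::weak_ptr<VBucket> vb;

    bool loadBatch(const std::string& key) {
        auto locked = vb.lock(); // empty once the original object is destroyed
        if (!locked) {
            return false; // original vbucket is gone: ignore it
        }
        locked->keys.push_back(key);
        return true;
    } // the temporary shared_ptr is released here, i.e. when warmup pauses
};

// Replays T1..T5 with lookup-by-id; returns the key count of the new vb:0.
std::size_t replayById() {
    VBucketMap map;
    map.replace(0);              // T1
    loadBatchById(map, 0, "k1"); // T3
    map.replace(0);              // T4
    loadBatchById(map, 0, "k2"); // T5 loads into the replacement
    return map.get(0)->keys.size();
}

// Replays T1..T5 with a weak_ptr; returns the key count of the new vb:0.
std::size_t replayWeak() {
    VBucketMap map;
    map.replace(0);                  // T1
    WarmupVBState state{map.get(0)}; // warmup records the original object
    bool first = state.loadBatch("k1");  // T3
    assert(first);
    map.replace(0);                  // T4 destroys the original object
    bool second = state.loadBatch("k2"); // T5 fails to lock and skips
    assert(!second);
    return map.get(0)->keys.size();
}
```

In this sketch `replayById()` leaves a warmup key in the replacement vbucket, while `replayWeak()` leaves it untouched. The sketch assumes the map holds the only long-lived shared_ptr; if another owner kept the old object alive, `lock()` would still succeed, but warmup would load into the orphaned object rather than the new one.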
This issue is certainly a greater risk now that MB-9418 is complete; we are very likely to see vbuckets created during the secondary warmup.
Issue Links
- relates to MB-62083 Warmup closes and reopens vbucket datafiles across the various phases (Open)