Description
Say we have 2 goroutines writing data into mossStore:
goroutine 1: ExecuteBatch() with segment X, takes a snapshot (X) and calls store.Persist(X)
goroutine 2: ExecuteBatch() with segment Y, takes a snapshot (X+Y) and calls store.Persist(X+Y)
Goroutine 1's Persist(X) is called first: it grabs the store lock, builds the new store footer, and releases the store lock here:
https://github.com/couchbase/moss/blob/master/store.go#L136
It then goes ahead and issues I/O to actually persist segment X.
Goroutine 2 calls Persist(X+Y), grabs the lock released by goroutine 1, builds its own store footer, and is now persisting segment X+Y.
Due to an I/O race, goroutine 2's write completes first, and it manages to grab the second lock here:
https://github.com/couchbase/moss/blob/master/store.go#L159
Now the footer has X+Y and everything is good... until goroutine 1 completes its I/O and writes out its footer, which only has segment X.
As far as goroutine 2 is concerned, this results in data loss of segment Y if there is a shutdown or crash immediately afterwards.
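The interleaving above can be reproduced deterministically with a toy model. This is only a sketch: toyStore, persist, and raceDemo are hypothetical names and not the moss API, the footer is reduced to a list of segment names, and channels force the I/O ordering described in the report.

```go
package main

import (
	"fmt"
	"sync"
)

// toyStore is a hypothetical stand-in for a moss store (not the moss API).
type toyStore struct {
	mu     sync.Mutex
	footer []string // segments referenced by the last published footer
}

// persist mimics the two-phase pattern from the report: build the footer
// under the lock, release the lock for the segment I/O, then re-acquire
// the lock to publish the footer. writeSegment stands in for the I/O.
func (s *toyStore) persist(snapshot []string, writeSegment func()) {
	s.mu.Lock()
	footer := append([]string(nil), snapshot...) // build the new footer
	s.mu.Unlock()

	writeSegment() // I/O runs with the lock released

	s.mu.Lock()
	s.footer = footer // last writer wins, whatever its snapshot held
	s.mu.Unlock()
}

// raceDemo forces goroutine 1's I/O to finish after goroutine 2 has
// already published its footer, and returns the final footer.
func raceDemo() []string {
	s := &toyStore{}
	started1 := make(chan struct{}) // goroutine 1 has begun its I/O
	done2 := make(chan struct{})    // goroutine 2 has published its footer
	var wg sync.WaitGroup
	wg.Add(2)

	go func() { // goroutine 1: Persist(X) with slow I/O
		defer wg.Done()
		s.persist([]string{"X"}, func() {
			close(started1)
			<-done2
		})
	}()
	go func() { // goroutine 2: Persist(X+Y) with fast I/O
		defer wg.Done()
		<-started1
		s.persist([]string{"X", "Y"}, func() {})
		close(done2)
	}()

	wg.Wait()
	return s.footer
}

func main() {
	fmt.Println(raceDemo()) // prints [X]: segment Y is no longer referenced
}
```

In this model the stale footer always overwrites the newer one, because publication order follows I/O completion order rather than snapshot order.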
A simple fix might be to hold on to the store lock until the footer is persisted, but we need to evaluate the side effects on things like snapshots.
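A minimal sketch of that idea, using a toy model rather than the real moss code: lockedStore, persist, and fixedDemo are hypothetical names, and the sketch assumes that Persist calls acquire the lock in the same order in which their snapshots were taken.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lockedStore is a hypothetical stand-in for a moss store (not the moss API).
type lockedStore struct {
	mu     sync.Mutex
	footer []string // segments referenced by the last published footer
}

// persist holds the store lock from footer construction until the footer
// is published, so a later Persist cannot be overtaken by an earlier one.
func (s *lockedStore) persist(snapshot []string, writeSegment func()) {
	s.mu.Lock()
	defer s.mu.Unlock()
	footer := append([]string(nil), snapshot...) // build the new footer
	writeSegment()                               // I/O runs under the lock
	s.footer = footer                            // publish before unlocking
}

// fixedDemo replays the scenario: goroutine 1 has slow I/O, goroutine 2
// starts its Persist while goroutine 1 is still writing.
func fixedDemo() []string {
	s := &lockedStore{}
	started1 := make(chan struct{}) // goroutine 1 has begun its I/O
	var wg sync.WaitGroup
	wg.Add(2)

	go func() { // goroutine 1: Persist(X) with slow I/O
		defer wg.Done()
		s.persist([]string{"X"}, func() {
			close(started1)
			time.Sleep(10 * time.Millisecond) // simulated slow write
		})
	}()
	go func() { // goroutine 2: Persist(X+Y), blocks until goroutine 1 is done
		defer wg.Done()
		<-started1
		s.persist([]string{"X", "Y"}, func() {})
	}()

	wg.Wait()
	return s.footer
}

func main() {
	fmt.Println(fixedDemo()) // prints [X Y]
}
```

The cost is that every other user of the store lock waits for the full duration of the I/O, which is the kind of side effect that would need evaluating.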
thanks
Attachments
For Gerrit Dashboard: MB-23665
# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
76155,2 | MB-23665 - update Persist() API docs, comments & whitespace | master | moss | MERGED | +2 | +1 |