Details
- Bug
- Resolution: Not a Bug
- Major
- None
- 7.0.0, 7.0.1, 7.0.2, 7.0.3
- None
- Untriaged
- 1
- Unknown
- KV 2021-Dec, KV 2022-Jan
Description
Consider the following examples.
First, ignore collections; this situation could already arise in 6.6.
- vb:0 has 5 items and they are all deleted, so 5 mutations followed by 5 deletions
- vb:0 reports a high-seqno of 10
- vb:0 on disk only has seqs 6 - 10, all tombstones
- vb:0 now purges all tombstones
- KV never purges a tombstone if it is the high-seqno
- vb:0 on disk has 1 tombstone @ seqno 10
Next
- DCP client comes along and asks KV for the high-seqno (getAllVBSeqnos); we return 10.
- DCP client does a stream request from start=0 to end=10
- KV sends 1 tombstone and closes the stream
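The first example can be sketched as a small simulation. This is a toy model, not KV-engine code: `purge_tombstones` and `stream` are hypothetical helpers, the on-disk state is just a dict keyed by seqno, and it assumes the deletions occupy seqnos 6 to 10 and that a stream request with start=0 delivers every item with a seqno greater than 0 and up to the end seqno.

```python
def purge_tombstones(disk, high_seqno):
    """Drop every tombstone except one sitting at the high-seqno."""
    return {
        seqno: kind
        for seqno, kind in disk.items()
        if kind != "tombstone" or seqno == high_seqno
    }


def stream(disk, start, end):
    """Return the items a client would receive for (start, end]."""
    return [(seqno, disk[seqno]) for seqno in sorted(disk) if start < seqno <= end]


# Assumption: mutations used seqnos 1..5 and were superseded on disk,
# leaving only the five deletions at seqnos 6..10.
disk = {seqno: "tombstone" for seqno in range(6, 11)}
high_seqno = 10

disk = purge_tombstones(disk, high_seqno)
sent = stream(disk, start=0, end=high_seqno)
```

In this model the client still receives something at its requested end seqno, so the stream has a clear point at which to close.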
Second example: now consider collections. Here we have two collections, default and c0.
- vb:0 default collection has 5 items, all deleted
- vb:0 default collection has a high-seqno of 10
- vb:0 is then written to by c0
- vb:0 on disk has seqs 6 to 10 as tombstones and seqno 11 for c0
- vb:0 now purges all tombstones
- vb:0 on disk now only has seqno 11 for c0
Next, a legacy DCP client is used; it only knows about the default collection.
- DCP client comes along and asks KV for the high-seqno (getAllVBSeqnos); we return 10 (the high-seqno of the default collection)
- DCP client does a stream request from start=0 to end=10
- KV sends nothing
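The collections case can be modelled the same way. Again this is only an illustrative sketch under assumptions: `purge_tombstones` and `legacy_stream` are hypothetical names, each on-disk entry is a `(collection, kind)` pair, the tombstones are assumed to sit at seqnos 6 to 10, and the per-collection high-seqno is assumed to be tracked separately from the items on disk.

```python
def purge_tombstones(disk, vb_high_seqno):
    """Drop every tombstone except one sitting at the vbucket high-seqno."""
    return {
        seqno: entry
        for seqno, entry in disk.items()
        if entry[1] != "tombstone" or seqno == vb_high_seqno
    }


def legacy_stream(disk, start, end):
    """A legacy client only ever sees default-collection items in (start, end]."""
    return [
        (seqno, entry)
        for seqno, entry in sorted(disk.items())
        if entry[0] == "default" and start < seqno <= end
    ]


disk = {seqno: ("default", "tombstone") for seqno in range(6, 11)}
disk[11] = ("c0", "mutation")
collection_high_seqno = {"default": 10, "c0": 11}

# The vbucket high-seqno is 11 (a c0 mutation), so no tombstone is protected.
disk = purge_tombstones(disk, vb_high_seqno=max(disk))

end = collection_high_seqno["default"]  # what the legacy client is told
sent = legacy_stream(disk, start=0, end=end)
```

Here `sent` is empty even though the client was told the high-seqno is 10, which is the gap the ticket is concerned with.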
We need to figure out what really happens here. ActiveStream may go from backfilling to in-memory, then spot that we've passed the end-seqno and end the stream, or we may hang.
Overall it may be simpler for us to retain the default collection high-seqno tombstone.
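A rough sketch of that retention idea, under the same toy model as the examples above (a dict of seqno to `(collection, kind)`, with `purge_tombstones` as a hypothetical helper and the tombstones assumed to sit at seqnos 6 to 10), is to protect the default collection's high-seqno alongside the vbucket's:

```python
def purge_tombstones(disk, vb_high_seqno, default_high_seqno):
    """Purge tombstones, but keep any at the vbucket or default-collection high-seqno."""
    protected = {vb_high_seqno, default_high_seqno}
    return {
        seqno: entry
        for seqno, entry in disk.items()
        if entry[1] != "tombstone" or seqno in protected
    }


disk = {seqno: ("default", "tombstone") for seqno in range(6, 11)}
disk[11] = ("c0", "mutation")

after = purge_tombstones(disk, vb_high_seqno=11, default_high_seqno=10)

# What a legacy (default-only) client streaming 0..10 would now see.
legacy_view = [
    seqno
    for seqno, (collection, _kind) in sorted(after.items())
    if collection == "default" and 0 < seqno <= 10
]
```

With the tombstone at seqno 10 retained, the legacy client's stream behaves like the first example and has an item at its end seqno.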
Note that in the second example, if the client were collection-aware, the issue goes away, as we can use seqno-advance to replace gaps and move the client forwards.
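The seqno-advance idea can be illustrated with the same toy model. This is a simplified sketch, not the actual DCP message format or the exact condition KV uses: `stream_with_seqno_advance` is a hypothetical helper, and it assumes that when the last item sent for the client's collection falls short of the snapshot end, a single seqno-advance message carries the client to that end.

```python
def stream_with_seqno_advance(disk, collection, start, snapshot_end):
    """Send matching items, then a seqno-advance if the snapshot end was not reached."""
    messages = [
        ("item", seqno)
        for seqno, (coll, _kind) in sorted(disk.items())
        if coll == collection and start < seqno <= snapshot_end
    ]
    last_sent = messages[-1][1] if messages else start
    if last_sent < snapshot_end:
        # Nothing visible to this client at the snapshot end: cover the gap.
        messages.append(("seqno_advanced", snapshot_end))
    return messages


# State after the purge in the second example: only the c0 mutation remains.
disk = {11: ("c0", "mutation")}
msgs = stream_with_seqno_advance(disk, "default", start=0, snapshot_end=11)
```

A legacy client has no such message available, which is why the gap left by the purge matters only for it.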