Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
6.6.2
-
Security Level: Public
-
Untriaged
-
1
-
Unknown
Description
Summary
During a Data node failure due to OOM-kill, GSI indexer received a snapshot marker where the snapStart was less than the snapEnd of the previous, complete Snapshot.
This breaks invariants assumed by DCP consumers (indexer here), if the consumer is disconnected and has to reconnect + streamRequest to resume then they may be told to rollback unnecessarily.
Steps To Reproduce
<TBD>
Details
- Data service node-d5 failed over around 2022-07-17T01:07.
- Index nodes started receiving StreamEnd messages around 2022-07-17T01:01:12 (around the time memcached process got OOM killed on node-d5)
2022-07-17T01:01:12.492+07:00 [Info] TK StreamEnd MAINT_STREAM bucket-product 1012 146870309391523 1810315
- These StreamEnd messages were generated by projector, once TCP connection to memcached got closed:
2022-07-17T01:01:12.570+07:00 [Error] DCPT[secidx:proj-bucket-product-MAINT_STREAM_TOPIC_96b58eb2e82a7cf57d51a9ae0637625d-12990437825450808209/1] doReceive(): read tcp 10.26.5.17:16919->10.26.5.17:11210: read: connection reset by peer
- Indexer attempted to re-establish DCP connection by following its DCP repair mechanisms. Initial attempts failed as memcached on node-d5 was not available.
2022-07-17T01:06:15.078+07:00 [Warn] Slow/Hung Operation: KVSender::sendRestartVbuckets did not respond for 5m0.701959279s for projector node-d5:9999 topic MAINT_STREAM_TOPIC_2f75a165518bf54fd561760749a827a1 bucket bucket-productorder
- After node-d5 node failover, Indexer was able to re-negotiate the DCP connection for disconnected vbuckets. But this time, DCP responded that on some of the vbuckets, a rollback to a lower seqno was required.
2022-07-17T01:11:33.746+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 687. Rollback
(1803022, d77d8839c0a0).
- Indexer then checked if the 2 available disk snapshots had seqno lower than/equal to what DCP had requested.
2022-07-17T01:24:44.816+07:00 [Info] SnapshotContainer::GetOlderThanTS Returning nil as no matching snapshot found
- As both the disk snapshots didn’t satisfy this requirement, indexer decided to rollback to zero.
2022-07-17T01:25:02.391+07:00 [Info] StorageMgr::rollbackAllToZero MAINT_STREAM bucket-product
Detailed analysis of specific vBucket
Taking vb 1012 for bucket bucket-product as example on index node node-i1:
- Indexer snapshots have seqno=1810315 for vb 1012 in both on-disk snapshots:
2022-07-17T01:24:44.762+07:00 [Info] SnapshotContainer::GetOlderThanTS Comparing with snapshot SnapshotInfo: count:71747364 committed:false, snapTs = [687=1803028, 694=1797300, 754=1810066, 760=1803299, 789=1800879, 797=1804008, 798=1802954, 829=1804326,1012=1810315}
2022-07-17T01:24:44.807+07:00 [Info] SnapshotContainer::GetOlderThanTS Comparing with snapshot SnapshotInfo: count:71349292 committed:false, snapTs =[687=1803028, 694=1797300, 754=1810066, 760=1803299, 789=1800879, 797=1804008, 798=1802954, 829=1804326,1012=1810315}
- Also indexer in-memory bookkeeping has the following DCP snapshot information for vb 1012
{seqno:1810315 snapStart:1810045 snapEnd:1815372}
2022-07-17T01:06:32.994+07:00 [Error] Timekeeper::sendRestartMsg Error Response from KV feed.feeder For Request
Message: MsgRestartVbuckets
StreamId: MAINT_STREAM
Bucket: bucket-product
SessionId: 1
RestartTS: bucket: bucket-product, vbuckets: 113 Crc64: 0 snapType NO_SNAP -
{vbno, vbuuid, seqno, snapshot-start, snapshot-end}
...
{ 1012 8593e89e8ca3 1810315 1810045 1815372}
- One important thing to note is that indexer can only create storage snapshots if it has receivd complete DCP snapshots for all vbuckets ie. seqno==snapEnd.
So this combination cannot be present in storage snapshots(which are used for rollback):{seqno:1810315 snapstart:1810045 snapend:1815372}
And it is the in-memory bookkeeping state.
- This can be confirmed from the indexer stats. "num_nonalign_ts" tracks how many non snapshot aligned(i.e. seqno != snapEnd) timestamps were generated. It is 0 for all the buckets.
2022-07-17T00:59:30.990+07:00 [Info] PeriodicStats = "bucket-product:num_nonalign_ts":0
2022-07-17T01:23:31.087+07:00 [Info] PeriodicStats = "bucket-product:num_nonalign_ts":0
- One possible explanation here is that the index storage snapshot has something like this:
{ 1012 8593e89e8ca3 1810315 1810315 1810315}
And then later DCP sent snapshot with:
{snapStart:1810045 snapEnd:1815372}
but no more mutations.
This leaves indexer in-memory bookkeeping state as(which is what it is asking DCP to restart with):{ 1012 8593e89e8ca3 1810315 1810045 1815372}
This kind of snapshot generation where snapStart of the next snapshot is lower than the snapEnd of previous snapshot can lead to such a behavior.
- Indexer in-memory book-keeping shows that MAINT_STREAM is caught up with DCP(before rollback) and DCP snapshots are either 1 or 2 in size for all vbuckets (except 9)
2022-07-17T01:24:44.753+07:00 [Info] Timekeeper::initiateRecovery StreamId MAINT_STREAM Bucket bucket-product SessionId 1 RestartTs bucket: bucket-product, vbuckets: 1024 Crc64: 5159790531118676693 snapType INMEM_SNAP -
{vbno, vbuuid, seqno, snapshot-start, snapshot-end}
{ 0 b46f8737a9ac 1815031 1815031 1815031}
{ 1 e93fec117803 1806674 1806674 1806674}
{ 2 7a12c129ca3a 1811929 1811929 1811929}
{ 3 efa0c55a52b3 1812471 1812471 1812471}
{ 4 d4307a6486a6 1808308 1808308 1808308}
{ 5 85b7d45a8923 1815052 1815052 1815052}
{ 6 14282371d3ff 1815735 1815735 1815735}
{ 7 6de2b44ffabf 1815739 1815739 1815739}
And for all the 9 vbuckets which have large size snapshots, all of those did get rollback from DCP(all other vbuckets worked fine)
{ 687 d77d8839c0a0 1803028 1803022 1804857}
{ 694 e5916446b3e9 1797300 1796844 1797777}
{ 754 94678efbcc4a 1810066 1809611 1810196}
{ 760 4372be2b28da 1803299 1803297 1803595}
{ 789 b5614fa86ea1 1800879 1800853 1801123}
{ 797 ee1cd1e056e 1804008 1803946 1804893}
{ 798 f6798a8a6f9e 1802954 1802684 1806442}
{ 829 75b47a9e6803 1804326 1804245 1804705}
{ 1012 8593e89e8ca3 1810315 1810045 1815372}
2022-07-17T01:11:33.746+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 687. Rollback (1803022, d77d8839c0a0).
2022-07-17T01:11:34.221+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 694. Rollback (1796843, e5916446b3e9).
2022-07-17T01:14:52.227+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 1012. Rollback (1810045, 8593e89e8ca3).
2022-07-17T01:15:19.652+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 760. Rollback (1803296, 4372be2b28da).
2022-07-17T01:15:19.655+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 754. Rollback (1809611, 94678efbcc4a).
2022-07-17T01:15:29.239+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 789. Rollback (1800853, b5614fa86ea1).
2022-07-17T01:15:29.260+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 797. Rollback (1803945, ee1cd1e056e).
2022-07-17T01:15:29.261+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 798. Rollback (1802684, f6798a8a6f9e).
2022-07-17T01:15:37.939+07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket bucket-product vbucket 829. Rollback (1804244, 75b47a9e6803).
- After failover, indexer requested the new vb master to start the DCP stream with the following information
{seqno:1810315 snapstart:1810045 snapend:1815372}
However, the new vb master (node-d5, earlier vb-replica) was behind the indexer and asked the indexer to rollback.
2022-07-17T01:14:52.197896+07:00 WARNING 1817: (bucket-product) DCP (Producer) eq_dcpq:secidx:proj-bucket-product-MAINT_STREAM_TOPIC_2f75a165518bf54fd561760749a827a1-1990393963162632936/0 - (vb:1012) Stream request requires rollback to seqno:1810045 because consumer ahead of producer - producer upper at 1815371. Client requested seqnos:{1810315,18446744073709551615} snapshot:{1810045,1815372} uuid:146870309391523
The seqno of rollback seqno:1810045 is lower than both index disk snapshots (seqno:1810315). This lead to indexer deciding to rollback to zero.
Attachments
Issue Links
For Gerrit Dashboard: MB-53057 | ||||||
---|---|---|---|---|---|---|
# | Subject | Branch | Project | Status | CR | V |
177666,2 | MB-53057 Use all disk snapshots on DCP rollback | neo | indexing | Status: MERGED | +2 | +1 |
177667,5 | MB-53057 Use all disk snapshots on DCP rollback | mad-hatter | indexing | Status: MERGED | +2 | +1 |
177668,5 | MB-53057 Check seq. order violation for snapshot messages | mad-hatter | indexing | Status: MERGED | +2 | +1 |
181850,6 | MB-53252 Use all disk snapshots on DCP rollback | cheshire-cat | indexing | Status: MERGED | +2 | +1 |