Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-56744

[CBSE] Restart timestamp is greater than the previous computed timestamp while rolling back

    XMLWordPrintable

Details

    • Untriaged
    • 0
    • Unknown

    Description

      ================================================================

       

      1. The problem starts with indexer skipping the build completion disk snapshot due to MB-46797.

      2. The last disk snapshot created before the build completion (2 index, 3 partitions)

      2023-03-18T17:20:50.439-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 17574301434787230136 PartitionId 4 Creating New Snapshot SnapshotInfo: count:513621127 committed:true
      2023-03-18T17:20:50.439-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 17574301434787230136, PartitionId 4 Creating recovery point ...
      2023-03-18T17:20:50.466-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 3887017921767071747 PartitionId 2 Creating New Snapshot SnapshotInfo: count:514303824 committed:true
      2023-03-18T17:20:50.466-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 2 Creating recovery point ...
      2023-03-18T17:20:50.466-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 3887017921767071747 PartitionId 5 Creating New Snapshot SnapshotInfo: count:513387893 committed:true
      2023-03-18T17:20:50.466-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 5 Creating recovery point ...

      3. One index/partition finishes

      2023-03-18T17:25:21.930-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 17574301434787230136, PartitionId 4 Created recovery point (took 4m31.491268326s)
      2023-03-18T17:25:21.933-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 17574301434787230136, PartitionId 4 Cleanup mainstore recovery point SnapshotInfo: count:498180407 committed:false. num RPs 3.
      

      4. Next set of disk snapshots get created (build completion disk snapshots).

      2023-03-18T17:25:52.715-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 17574301434787230136 PartitionId 4 Creating New Snapshot SnapshotInfo: count:528652160 committed:true
      2023-03-18T17:25:52.715-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 3887017921767071747 PartitionId 5 Creating New Snapshot SnapshotInfo: count:528416182 committed:true
      2023-03-18T17:25:52.717-07:00 [Info] plasmaSlice::OpenSnapshot SliceId 0 IndexInstId 3887017921767071747 PartitionId 2 Creating New Snapshot SnapshotInfo: count:529367068 committed:true
      

      5. Only one disk snapshot actually gets initiated but 2 get skipped as the disk snapshot triggered in Step2 is still in progress.

      2023-03-18T17:25:52.716-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 17574301434787230136, PartitionId 4 Creating recovery point ...
      2023-03-18T17:25:52.715-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 5 Skipping ondisk snapshot. A snapshot writer is in progress.
      2023-03-18T17:25:52.717-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 2 Skipping ondisk snapshot. A snapshot writer is in progress.
      

      6. The pending disk snapshots from Step2 finish.

      2023-03-18T17:26:02.935-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 2 Created recovery point (took 5m12.468437027s)
      2023-03-18T17:26:02.938-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 2 Cleanup mainstore recovery point SnapshotInfo: count:498844929 committed:false. num RPs 3.
      2023-03-18T17:26:59.286-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 5 Created recovery point (took 6m8.819277349s)
      2023-03-18T17:26:59.289-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 3887017921767071747, PartitionId 5 Cleanup mainstore recovery point SnapshotInfo: count:497955619 committed:false. num RPs 3.
      

      7. The disk snapshot in Step5 gets created.

      2023-03-18T17:27:59.523-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 17574301434787230136, PartitionId 4 Created recovery point (took 2m6.807509695s)
      2023-03-18T17:27:59.527-07:00 [Info] PlasmaSlice Slice Id 0, IndexInstId 17574301434787230136, PartitionId 4 Cleanup mainstore recovery point SnapshotInfo: count:513621127 committed:false. num RPs 3.
      

      8. Now the skipped disk snapshot would only get created if there are more mutation in the bucket, which doesn't happen in this case.

      9. When indexer restarts, it computes the minimum seqnum for all vbuckets to restart DCP stream. It can be seen that some of the vbuckets are lower than the snapEnd.

      2023-03-18T20:20:21.568-07:00 [Info] Indexer::startBucketStream Stream: MAINT_STREAM Bucket: MLEhist
      {vbno, vbuuid, seqno, snapshot-start, snapshot-end}
      { 72 3150111d5795 5766646 0 8700722}
      { 80 f4943cd4534f 6140788 0 8697218} 
      { 96 d6c0afb9feb1 6069126 0 8675696} 
      { 158 fac37da0dcc8 3830175 0 8699706} 
      { 195 4190b0fe34f0 4255513 0 8683042}
      

      10. This leads to rollback to 0 as the KV purge_seqno is already higher.

      2023-03-18T20:20:22.612-07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket MLEhist vbucket 72. Rollback (0, 3150111d5795).
      2023-03-18T20:20:22.618-07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket MLEhist vbucket 80. Rollback (0, f4943cd4534f).
      2023-03-18T20:20:22.637-07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket MLEhist vbucket 96. Rollback (0, d6c0afb9feb1).
      2023-03-18T20:20:23.209-07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket MLEhist vbucket 158. Rollback (0, fac37da0dcc8).
      2023-03-18T20:20:23.223-07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket MLEhist vbucket 166. Rollback (0, ae4aed62ffd3).
      2023-03-18T20:20:23.491-07:00 [Warn] Timekeeper::handleStreamBegin StreamBegin rollback for StreamId MAINT_STREAM Bucket MLEhist vbucket 195. Rollback (0, 4190b0fe34f0).
      

      11. Storage Manager uses all disk snapshots to try to start DCP streams in case of rollback to 0. First it uses the latest snapshots.

      2023-03-18T20:20:34.200-07:00 [Info] StorageMgr::handleRollback latestSnapInfo SnapshotInfo: count:533326858 committed:false lastRollbackTs <nil>. Use latest snapshot.
      2023-03-18T20:20:45.892-07:00 [Info] StorageMgr::handleRollback Rollback Index: 17574301434787230136 PartitionId: 4 SliceId: 0 To Snapshot SnapshotInfo: count:533326858 committed:false
      2023-03-18T20:21:01.009-07:00 [Info] StorageMgr::handleRollback Rollback Index: 4260340739290744432 PartitionId: 3 SliceId: 0 To Snapshot SnapshotInfo: count:711175443 committed:false
      2023-03-18T20:21:07.610-07:00 [Info] StorageMgr::handleRollback Rollback Index: 8324473573058655730 PartitionId: 5 SliceId: 0 To Snapshot SnapshotInfo: count:711219055 committed:false
      2023-03-18T20:21:15.218-07:00 [Info] StorageMgr::handleRollback Rollback Index: 3887017921767071747 PartitionId: 2 SliceId: 0 To Snapshot SnapshotInfo: count:529367068 committed:false
      2023-03-18T20:21:21.131-07:00 [Info] StorageMgr::handleRollback Rollback Index: 3887017921767071747 PartitionId: 5 SliceId: 0 To Snapshot SnapshotInfo: count:528416182 committed:false
      2023-03-18T20:21:22.821-07:00 [Info] Indexer::startBucketStream Stream: MAINT_STREAM Bucket: MLEhist
      { 72 3150111d5795 5766646 0 8700722}
      { 80 f4943cd4534f 6140788 0 8697218}
      { 96 d6c0afb9feb1 6069126 0 8675696}
      

      12. Then it tries with the next older snapshots

      2023-03-18T20:21:35.422-07:00 [Info] StorageMgr::handleRollback Discarding Already Used Snapshot SnapshotInfo: count:529367068 committed:false. Using Next snapshot SnapshotInfo: count:514303824 committed:false
      2023-03-18T20:21:57.137-07:00 [Info] StorageMgr::handleRollback Discarding Already Used Snapshot SnapshotInfo: count:528416182 committed:false. Using Next snapshot SnapshotInfo: count:513387893 committed:false
      2023-03-18T20:22:03.031-07:00 [Info] StorageMgr::handleRollback Discarding Already Used Snapshot SnapshotInfo: count:533326858 committed:false. Using Next snapshot SnapshotInfo: count:528652160 committed:false
      2023-03-18T20:22:09.011-07:00 [Info] StorageMgr::handleRollback Discarding Already Used Snapshot SnapshotInfo: count:711175443 committed:false. Using Next snapshot SnapshotInfo: count:711175443 committed:false
      2023-03-18T20:22:16.733-07:00 [Info] StorageMgr::handleRollback Discarding Already Used Snapshot SnapshotInfo: count:711219055 committed:false. Using Next snapshot SnapshotInfo: count:711219055 committed:false
      

      13. But this time, it doesn't compute the restart timestamp correctly

      2023-03-18T20:22:25.364-07:00 [Info] Indexer::startBucketStream Stream: MAINT_STREAM Bucket: MLEhist
      { 72 3150111d5795 8700722 8700722 8700722}
      { 80 f4943cd4534f 8697218 8697218 8697218}
      { 96 d6c0afb9feb1 8675696 8675696 8675696}
      

      The minimum timestamp must have been lower than the one used in previous try. But it is actually greater. This leads to the rolled back mutations never being streamed back from DCP.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            sai.teja Sai Krishna Teja
            sai.teja Sai Krishna Teja
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty