  1. Couchbase Server
  2. MB-62000

Warmup can write data into the wrong vbucket (correct ID, just wrong object)

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • 6.0.0, 6.0.1, 6.0.2, 6.0.3, 6.0.4, 6.0.5, 6.5.1, 6.6.0, 6.6.1, 6.6.2, 6.5.2, 6.5.0, 7.6.0, 6.6.3, 6.6.4, 6.6.5, 6.6.6, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.1.4, 7.0.5, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.1.3, 7.2.1, 7.1.5, 7.2.4, 7.2.2, 7.1.6, 7.2.3, 7.2.5, 7.6.1
    • couchbase-bucket
    • None
    • Untriaged
    • 0
    • Unknown

    Description

      Warmup uses a pause/resume pattern. While paused, it holds no reference to the vbucket object it is loading data into. This issue tracks the fact that nothing protects against the vbucket object being replaced while Warmup is paused.

      E.g.

      • T1: Warmup creates vb:0
      • T2: Warmup reached KeyDump phase, certain operations are now unblocked e.g. resetVB/setVBState
      • T3: Warmup obtains vb:0 and loads keys and pauses -> vb:0 released
      • T4: vb:0 is replaced, e.g. set(dead) followed by set(replica/active/pending)
      • T5: Warmup resumes, obtains vb:0 again and loads keys into the new vb:0

      The final state is that vb:0 now contains a mix of data from warmup and later maybe from DCP.
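
      The T1 to T5 sequence above can be replayed in a small sketch. Everything here is a hypothetical stand-in (VBucket, VBucketMap, warmupStep and replayTimeline are illustrative names, not the real kv_engine types); it only shows that looking the vbucket up by ID on every step silently lands on the replacement object.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a vbucket: just the keys loaded into it.
struct VBucket {
    std::vector<std::string> keys;
};

// Hypothetical stand-in for the vbucket map: ID -> owning pointer.
struct VBucketMap {
    std::map<int, std::shared_ptr<VBucket>> vbs;

    std::shared_ptr<VBucket> get(int id) const {
        auto it = vbs.find(id);
        if (it == vbs.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Creates the vbucket, or replaces it with a brand new object.
    void replace(int id) {
        vbs[id] = std::make_shared<VBucket>();
    }
};

// One warmup step: look the vbucket up by ID, load a key, then drop the
// reference again (the "pause").
void warmupStep(VBucketMap& map, int id, const std::string& key) {
    if (auto vb = map.get(id)) {
        vb->keys.push_back(key);
    }
}

// Replays the timeline and returns the keys held by the final vb:0.
std::vector<std::string> replayTimeline() {
    VBucketMap map;
    map.replace(0);            // T1: warmup creates vb:0
    warmupStep(map, 0, "k1");  // T3: load keys, pause, reference released
    map.replace(0);            // T4: vb:0 replaced by a new object
    warmupStep(map, 0, "k2");  // T5: same ID, but a different object
    return map.get(0)->keys;
}
```

      In this sketch the final vb:0 holds "k2", a key written by warmup into an object that warmup never created.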

      Note that there is no evidence that step T4 fully occurs. We do see cases where a vbucket is deleted during warmup, but we have not seen it recreated.

      The correct solution here is for Warmup to use a weak_ptr<VBucket> rather than asking for a shared_ptr<VBucket> as it progresses through the phases. When Warmup fails to lock the weak_ptr it can now ignore that vbucket.
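
      A minimal sketch of the weak_ptr approach, under the assumption that the vbucket map holds the only owning reference. WarmupTask, VBucketMap, loadKey and replayWithWeakPtr are hypothetical names for illustration, not the actual Warmup code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a vbucket: just the keys loaded into it.
struct VBucket {
    std::vector<std::string> keys;
};

// Hypothetical stand-in for the vbucket map: ID -> owning pointer.
struct VBucketMap {
    std::map<int, std::shared_ptr<VBucket>> vbs;

    std::shared_ptr<VBucket> get(int id) const {
        auto it = vbs.find(id);
        if (it == vbs.end()) {
            return nullptr;
        }
        return it->second;
    }

    void replace(int id) {
        vbs[id] = std::make_shared<VBucket>();
    }
};

// Warmup remembers the object it created via a weak_ptr instead of
// looking it up by ID after every pause.
struct WarmupTask {
    std::weak_ptr<VBucket> vb;

    // Returns false when the original vbucket no longer exists, in
    // which case warmup can skip it.
    bool loadKey(const std::string& key) {
        auto locked = vb.lock();
        if (!locked) {
            return false;
        }
        locked->keys.push_back(key);
        return true;
    }
};

struct Outcome {
    bool firstLoad;
    bool secondLoad;
    std::size_t keysInNewVb;
};

Outcome replayWithWeakPtr() {
    VBucketMap map;
    map.replace(0);                    // T1: warmup creates vb:0
    WarmupTask task{map.get(0)};       // non-owning reference to that object
    bool first = task.loadKey("k1");   // T3: lock succeeds, key loaded
    map.replace(0);                    // T4: old object destroyed, new one created
    bool second = task.loadKey("k2");  // T5: lock fails, vbucket ignored
    return {first, second, map.get(0)->keys.size()};
}
```

      Here the second load is rejected and the new vb:0 stays empty. If another component still held a shared_ptr to the old object, lock() would succeed, but the write would go to the old object rather than the replacement.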

      This issue is certainly a greater risk now that MB-9418 is complete, because we are very likely to see vbuckets created during the secondary warmup.


            People

              jwalker Jim Walker
              jwalker Jim Walker

                Gerrit Reviews

                  There is 1 open Gerrit change