  1. Couchbase Server
  2. MB-51338

GetRangeSplitItems can return a larger number of items than requested

Details

    • Untriaged
    • 1
    • No

    Description

      Operations such as PersistAll take a concurrency parameter that is used to split the skiplist into range partitions. For concurrency=n, an array of n wctx is allocated first, and then n range partitions are generated using GetRangeSplitItems. If GetRangeSplitItems yields more than n range partitions, indexing into the wctx array can cause an index out of range panic.

      === RUN   TestShardRecoveryRecoveryLogCorruption
      ----------- running TestShardRecoveryRecoveryLogCorruption
      ....
      Shard shards/shard1(1) : instance test.default.TestShardRecoveryRecoveryLogCorruption_19 started
      Loading data...
      Plasma: Plasma.monitor: Starting monitor
      panic: runtime error: index out of range [1] with length 1
      goroutine 491912 [running]:
      panic(0x7e9a40, 0xc000e761f8)
          /home/buildbot/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/panic.go:1065 +0x565 fp=0xc000065e18 sp=0xc000065d50 pc=0x43b3e5
      runtime.goPanicIndex(0x1, 0x1)
          /home/buildbot/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/panic.go:88 +0xa5 fp=0xc000065e60 sp=0xc000065e18 pc=0x438585
      github.com/couchbase/plasma.(*Plasma).PersistAll2.func1(0x7f412c8efcb0, 0x1, 0xc000212000, 0xffffffffffffffff, 0xc000a30f28, 0x47d70b)
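
      The failure pattern behind this trace can be sketched in isolation. The following is a minimal illustration, not plasma's code: `workerCtx` and `visitPartitions` are hypothetical names, and the sketch assumes only what the description states, namely that contexts are sized for the requested concurrency while the partition index comes from the number of partitions actually produced.

```go
package main

import "fmt"

// workerCtx is a hypothetical stand-in for a per-worker write context (wctx).
type workerCtx struct{ id int }

// visitPartitions sizes the context slice for the requested concurrency but
// indexes it by the partition number actually produced. The panic is
// recovered and returned as an error so the sketch can show both outcomes.
func visitPartitions(concurrency, produced int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	ctxs := make([]*workerCtx, concurrency)
	for i := range ctxs {
		ctxs[i] = &workerCtx{id: i}
	}
	for pid := 0; pid < produced; pid++ {
		_ = ctxs[pid].id // panics once pid >= concurrency
	}
	return nil
}

func main() {
	fmt.Println(visitPartitions(1, 1)) // as many partitions as requested
	fmt.Println(visitPartitions(1, 2)) // one partition more than requested
}
```

      With concurrency 1 and two partitions produced, the recovered error carries the same runtime message as the log above.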

       

      GetRangeSplitItems can return more than n items if more skiplist nodes are added to the chosen level while GetRangeSplitItems is still running. Nitro handles this correctly. In plasma, this must be fixed by making sure that the callback passed to PageVisitor can handle the extra range partitions, allocating wctx on the fly if more range partitions are returned.
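
      One way to allocate contexts on the fly is sketched below. This is an assumption-laden illustration rather than the actual plasma change: `writeCtx`, `ctxPool`, `newCtxPool`, and `get` are hypothetical names, and the mutex assumes that the visitor callback may be invoked from several goroutines at once.

```go
package main

import (
	"fmt"
	"sync"
)

// writeCtx is a hypothetical stand-in for plasma's wctx.
type writeCtx struct{ id int }

// ctxPool hands out one context per partition id and grows on demand.
type ctxPool struct {
	mu   sync.Mutex
	ctxs []*writeCtx
}

// newCtxPool pre-allocates n contexts, matching the requested concurrency.
func newCtxPool(n int) *ctxPool {
	p := &ctxPool{ctxs: make([]*writeCtx, 0, n)}
	for i := 0; i < n; i++ {
		p.ctxs = append(p.ctxs, &writeCtx{id: i})
	}
	return p
}

// get returns the context for partition pid, allocating any missing
// contexts first, so an unexpected extra partition cannot index past the end.
func (p *ctxPool) get(pid int) *writeCtx {
	p.mu.Lock()
	defer p.mu.Unlock()
	for len(p.ctxs) <= pid {
		p.ctxs = append(p.ctxs, &writeCtx{id: len(p.ctxs)})
	}
	return p.ctxs[pid]
}

func main() {
	pool := newCtxPool(1) // concurrency = 1
	var wg sync.WaitGroup
	for pid := 0; pid < 3; pid++ { // three partitions show up anyway
		wg.Add(1)
		go func(pid int) {
			defer wg.Done()
			_ = pool.get(pid)
		}(pid)
	}
	wg.Wait()
	fmt.Println(len(pool.ctxs)) // 3
}
```

      The trade-off in this sketch is a lock on every lookup; pre-allocating for the common case keeps the growth path rare.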

        Activity

          akhil.mundroy Akhil Mundroy added a comment - This is easily reproducible with a unit test and code instrumentation (adding a sleep in GetRangeSplitItems). Otherwise the issue is rare and not easy to run into.
          jliang John Liang added a comment - If not fixed, the indexer will crash. The fix is just one line, so it is better to fix it now than to risk it. Although the issue is not common, the probability of a crash could increase with a lot of indexes on small machines.

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-2453 contains plasma commit b7bfc6b with commit message: MB-51338: Avoid getting more partitions than requested in GetRangePartitions

          build-team Couchbase Build Team added a comment - Build couchbase-server-7.2.0-1011 contains plasma commit b7bfc6b with commit message: MB-51338: Avoid getting more partitions than requested in GetRangePartitions
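
          The commit message describes capping the number of partitions at the source. The contents of commit b7bfc6b are not shown here, so the following is only a guess at the general idea. `capSplitKeys` is a hypothetical helper, and the sketch assumes that k split keys delimit k+1 ranges, which may not match plasma's convention.

```go
package main

import "fmt"

// capSplitKeys keeps at most maxPartitions-1 split keys so that they delimit
// at most maxPartitions ranges (assumption: k keys delimit k+1 ranges).
// Dropping trailing keys folds the extra ranges into the last partition, so
// no part of the key space is skipped.
func capSplitKeys(splitKeys []string, maxPartitions int) []string {
	if maxPartitions < 1 {
		maxPartitions = 1
	}
	if len(splitKeys) > maxPartitions-1 {
		return splitKeys[:maxPartitions-1]
	}
	return splitKeys
}

func main() {
	keys := []string{"d", "k", "r"}         // 3 split keys, 4 ranges
	fmt.Println(capSplitKeys(keys, 2))      // [d]
	fmt.Println(len(capSplitKeys(keys, 8))) // 3
}
```

          Capping at the source keeps every caller safe without changes to their callbacks, at the cost of a possibly larger last partition.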

          People

            akhil.mundroy Akhil Mundroy