Couchbase Server / MB-58094

[RECOVERY] Data is not always transferred concurrently

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version: 7.6.0
    • Affects Versions: 7.1.0, 7.2.0
    • Component: tools
    • Triage: Untriaged

    Description

      What is the problem?
      cbdatarecovery does not transfer vBuckets concurrently when there are more than 4096 documents per vBucket, regardless of what threads is set to.

      Every source loops over its data ranges (one vBucket at a time) and calls Document on the callbacks for each document. In the Couchbase sink, this method queues the document into the channel of the worker responsible for that vBucket. The send blocks once the channel is full; by default the channel has a buffer of 4096.

      The archive source loops over the data ranges in parallel, so it feeds more than one worker at the same time, which gives us concurrency. The recovery source loops over them sequentially, so the blocking send stalls the whole loop on a single worker's channel and we never transfer concurrently.

      This was introduced when each worker was changed to transfer only a subset of vBuckets (MB-37023).
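
      The behaviour described above can be illustrated with a minimal Go sketch. It is not the cbdatarecovery code: the doc type and queuedBeforeBlocking function are hypothetical, each worker is assumed to own exactly one vBucket, and a non-blocking send is used only to detect the point at which a real blocking send would stall the sequential loop.

```go
package main

import "fmt"

// doc is a hypothetical stand-in for a document read from the source.
type doc struct {
	vbucket int
	seq     int
}

// queuedBeforeBlocking feeds vBuckets one at a time (as the recovery source
// does) into per-worker buffered channels that nothing has drained yet. It
// returns how many documents each worker has queued at the moment a blocking
// send would stall the producer.
func queuedBeforeBlocking(workers, docsPerVBucket, buffer int) []int {
	queues := make([]chan doc, workers)
	for i := range queues {
		queues[i] = make(chan doc, buffer)
	}

	queued := make([]int, workers)
feed:
	for vb := 0; vb < workers; vb++ { // assumption: one vBucket per worker
		for seq := 0; seq < docsPerVBucket; seq++ {
			select {
			case queues[vb] <- doc{vbucket: vb, seq: seq}:
				queued[vb]++
			default:
				// A real blocking send would wait here, so no later
				// vBucket (and therefore no other worker) gets any work.
				break feed
			}
		}
	}
	return queued
}

func main() {
	fmt.Println(queuedBeforeBlocking(4, 5000, 4096)) // [4096 0 0 0]
	fmt.Println(queuedBeforeBlocking(4, 1000, 4096)) // [1000 1000 1000 1000]
}
```

      With more documents per vBucket than the buffer holds, only the first worker has anything queued when the loop stalls; when a vBucket fits in the buffer, every worker receives work.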

      What is the fix?
      We should use a worker pool to loop over the data ranges.

            People

              gilad.kalchheim Gilad Kalchheim
              Matt.Hall Matt Hall