  1. Couchbase Server
  2. MB-53764

[CBM] Investigate how to further improve staging directory population performance

Details

    • Task
    • Resolution: Fixed
    • Minor
    • 7.6.0
    • 7.1.0
    • tools
    • None
    • 1

    Description

      When looking into a forum issue around backups to S3, we noticed that populating the staging directory appeared to be a performance bottleneck. There have been a couple of improvements recently (see linked MBs), but there could still be gains to be made.

      To give an idea of the severity of the problem: testing on developer machines suggests that populating the staging directory (which includes listing all the files, downloading the ones we need and then unzipping metadata files) would take ~20 minutes for a repository containing 365 backups (i.e. a year's worth of daily backups). This is an extreme case, and not something we would recommend, but we have seen similar workloads in the wild. For example, the forum post involved a week's worth of hourly incrementals (168 backups), which would take ~9 minutes.

      Initial investigation suggests that on a 26-backup repository it took around 3.31s to populate a single backup's files: ~2.31s for listing/downloading and ~1s for unzipping the range, failover log, snapshot and stats files. On macOS the latter is far worse, probably due to worse IO performance.

      Some initial ideas:

      1. We currently list files sequentially to discover what repos exist. We could do this concurrently.
      2. We have a worker pool for unzipping, but only at the backup level. We could unzip the files for multiple backups concurrently. Unzipping four files at once seems to use very little CPU, so CPU usage should not be much of a concern.

            People

              Assignee: Matt Hall
              Reporter: Matt Hall