Couchbase Server / MB-18426

Change the default number of compactors that can run at once from 3 to 1


Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 4.5.1
    • Affects Version/s: 3.1.3, 4.1.0, 4.5.0
    • Component/s: couchbase-bucket, ns_server
    • Security Level: Public

    Description

      This setting is controlled by ns_server, but I believe the couchbase-bucket component should be the one making this call. I believe the default of 3 is too high. Currently the setting can be changed at run time using the snippet below:

      curl -X POST -u Administrator:password http://localhost:8091/diag/eval -d 'ns_config:set(compaction_number_of_kv_workers, 1).'
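
      To confirm the value currently in effect, the same diag/eval endpoint can be queried. This is a minimal sketch only, assuming ns_server's ns_config:search/1 returns the stored key; verify against your build before relying on it:

      curl -X POST -u Administrator:password http://localhost:8091/diag/eval -d 'ns_config:search(compaction_number_of_kv_workers).'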
      

      In three different use cases the support team has recommended changing this setting, and overall it has had a positive impact on the cluster:

      Use case one:

      In a DGM situation there was a small number of gets going to disk. When the compactor kicked in with 3 workers it would saturate the disk IO, causing bg fetches to run more slowly. When compaction_number_of_kv_workers was set to 1 there was less of a slowdown on the bg fetches, with only a very small increase in disk space used.
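
      One way to observe this effect is to sample the background-fetch and disk write queue counters while compaction is running. This is a sketch only; the bucket flags may differ between cbstats versions, and <bucket> is a placeholder:

      /opt/couchbase/bin/cbstats localhost:11210 all -b <bucket> | grep -E 'ep_bg_fetched|ep_queue_size|ep_flusher_todo'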

      Use case two:

      In a heavy write use case a document is written to once, read once, then deleted. Documents are only stored in the bucket until they are processed, so in a normal state a document only exists for a few minutes. It was noted that when the 3 compactors ran the disk would become fully saturated, causing huge spikes in the disk write queue of up to 36 million items. When compaction_number_of_kv_workers was reduced, it had the following impacts:

      • Reduced the disk write queue from 36 million items to tens of thousands, which also reduced the amount of memory the bucket was using.
      • Reduced the disk IO.
      • Most surprisingly, reduced the disk space required from 280GB to 40GB. I suspect this was because the deletes had been stuck behind the disk write queue.
      • The fragmentation percentage fluctuates a lot more (see the diskinfo sketch after this list).
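
      To keep an eye on the on-disk footprint and fragmentation after such a change, the per-bucket disk stats can be sampled. This is a sketch under the assumption that cbstats' diskinfo group reports ep_db_data_size and ep_db_file_size for the bucket; <bucket> is a placeholder:

      /opt/couchbase/bin/cbstats localhost:11210 diskinfo -b <bucket>
      # fragmentation % ~= (ep_db_file_size - ep_db_data_size) / ep_db_file_size * 100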

      Use case three:

      This is a high-performance use case with an extremely high number of writes. When compaction kicked in it would cause the disk queues to spike.
      Once compaction_number_of_kv_workers was reduced to 1, the disk write queue behaved much better, at the cost of a small increase in the disk space required.

      I think that by making this change we smooth things out. The compactor will run more often and overall compaction will take longer, but while it is running it has less of an effect on the cluster.

            People

              Assignee: Dave Rigby (drigby)
              Reporter: Patrick Varley (pvarley)
