Couchbase Server / MB-21470

Query engine requires memory tuning and optimization after upgrade to Go 1.7 (aka 8K problem)

Details

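    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 5.0.0
    • Affects Version/s: 5.0.0
    • Component/s: query
    • Environment: Hera cluster
    • Triage: Untriaged
    • Operating System: Centos 64-bit
    • Is this a Regression?: Yes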

    Description

      Let me start with a brief summary of results for the Q1 and Q2 not_bounded queries. Note: I used MOI (memory-optimized indexes) for the Q2 tests.

      Q1                       1.4.2     1.4.2     1.7.1     1.7.1     1.7.3     Master    Master
                               GOGC=100  GOGC=200  GOGC=100  GOGC=200  GOGC=100  GOGC=100  GOGC=200
      Throughput, queries/sec  25K       28K       8K        22K       8K        13K       26K
      RSS, MB                  145       200       65        160       65        115       170

      Q2                       1.4.2     1.4.2     1.7.1     1.7.1     1.7.3     Master    Master
                               GOGC=100  GOGC=200  GOGC=100  GOGC=200  GOGC=100  GOGC=100  GOGC=200
      Throughput, queries/sec  22K       24K       19K       22K       19K       30K       34K
      RSS, MB                  350       475       300       375       315       275       395

      I will focus on the most severe regression: Q1 after the upgrade to Go 1.7.

      Go 1.7 significantly reduces the memory footprint. Unfortunately, that improvement has a very negative impact on query engine performance:

      • Live heap size is very small now - 14-16MB.
      • The corresponding goal heap size is 30-35MB (GOGC=100).
      • A small live heap and a huge amount of garbage result in extremely frequent garbage collection events. According to my measurements, ~175 GC events per second happen during the Q1 workload.
      • Sweep termination and mark termination are still stop-the-world (STW) phases. The query engine ends up spending about 20% of its time in STW GC. That is a huge overhead.
      • The throughput is just 7-8K queries per second.

      Now the funny part. What if I run the Q2 workload right before the Q1 workload?

      • Q2 increases live heap size to ~150MB.
      • GC frequency decreases to ~20 events per second.
      • The query engine spends only 2-3% of time in STW GC.
      • Q1 throughput increases to 26K queries per second.

      In a way, running Q2 before Q1 is equivalent to applying custom GOGC settings.
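
      The equivalence can be illustrated with some back-of-the-envelope arithmetic. The sketch below assumes the simplified pacing model goal = live * (1 + GOGC/100), which roughly matches the 14-16MB live heap and 30-35MB goal reported above. The function names and the 15MB / 150MB inputs are illustrative assumptions, and the result is an estimate of GC headroom only, not a throughput prediction.

```go
package main

import (
	"fmt"
	"math"
)

// heapGoalMB estimates the heap size at which the next GC cycle triggers,
// under the simplified model goal = live * (1 + GOGC/100).
func heapGoalMB(liveMB float64, gogc int) float64 {
	return liveMB * (1 + float64(gogc)/100)
}

// headroomMB is how much garbage may accumulate between two collections.
func headroomMB(liveMB float64, gogc int) float64 {
	return heapGoalMB(liveMB, gogc) - liveMB
}

// equivalentGOGC returns the GOGC value that gives a small live heap the same
// headroom that a larger live heap gets at baseGOGC.
func equivalentGOGC(smallLiveMB, largeLiveMB float64, baseGOGC int) int {
	return int(math.Round(headroomMB(largeLiveMB, baseGOGC) / smallLiveMB * 100))
}

func main() {
	// Q1 alone: ~15MB live heap at GOGC=100.
	fmt.Printf("Q1 alone:    goal %.0fMB, headroom %.0fMB\n",
		heapGoalMB(15, 100), headroomMB(15, 100))
	// Q1 after Q2: ~150MB live heap at GOGC=100.
	fmt.Printf("Q1 after Q2: goal %.0fMB, headroom %.0fMB\n",
		heapGoalMB(150, 100), headroomMB(150, 100))
	// GOGC that would give the 15MB heap the same headroom.
	fmt.Printf("Equivalent GOGC for Q1 alone: %d\n", equivalentGOGC(15, 150, 100))
}
```

      Under this model the larger live heap gives about ten times more headroom per cycle, which is in the same ballpark as the observed drop from ~175 to ~20 GC events per second. A chosen value can be applied through the GOGC environment variable or at run time with debug.SetGCPercent from runtime/debug.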

      The upgrade to Go 1.7 is inevitable. We can tune GOGC, and we can tune the number of "servicers".

      That said, we really need to optimize the memory allocation.
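
      As one illustration of what reducing allocations can look like in Go, the sketch below reuses scratch buffers through sync.Pool instead of allocating a new one per request. This is a generic technique, not a description of the changes actually made to the query engine; renderRow and bufPool are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool recycles scratch buffers so that each call does not allocate a new one.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// renderRow is a hypothetical per-query step that needs a temporary buffer.
func renderRow(key string, value int) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a pooled buffer may still hold old contents
	defer bufPool.Put(buf) // hand the buffer back for reuse

	fmt.Fprintf(buf, "{%q:%d}", key, value)
	return buf.String() // String copies, so the result is safe after Put
}

func main() {
	fmt.Println(renderRow("id", 42))
}
```

      Pooling only helps when the pooled objects are a significant share of the garbage; an allocation profile (for example from pprof) is the usual way to find out which call sites matter.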

            People

              pavelpaulau Pavel Paulau (Inactive)