MB-49185 (Couchbase Server)

Windows Q1-Q3 Regressions


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • Neo
    • Neo
    • query
    • Untriaged
    • Windows 64-bit
    • 1
    • Unknown

    Description

      There appear to be two issues affecting the Windows Q1-Q3 tests.

      First, between builds 7.1.0-1085 and 7.1.0-1283, most tests show performance improvements; the exceptions, which regressed, are:

      Avg. Query Throughput (queries/sec), Q2, Singleton Unique Lookup, MOI, not_bounded

      Avg. Query Throughput (queries/sec), Q2, Singleton Unique Lookup, Plasma, not_bounded

      Then, between builds 7.1.0-1283 and 7.1.0-1345, regressions are seen in:

      Avg. Query Throughput (queries/sec), Q1, Key-Value Lookup

      Avg. Query Throughput (queries/sec), Q3, Range Scan, MOI, request_plus

      Avg. Query Throughput (queries/sec), Q3, Range Scan, MOI, not_bounded

      Avg. Query Throughput (queries/sec), Q3, Range Scan, Plasma, request_plus

      Avg. Query Throughput (queries/sec), Q3, Range Scan, Plasma, not_bounded

      The only tests that have improved and show no regression are:

      Avg. Query Throughput (queries/sec), Q2, Singleton Unique Lookup, MOI, request_plus

      Avg. Query Throughput (queries/sec), Q2, Singleton Unique Lookup, Plasma, request_plus

       


        Activity

          korrigan.clark Korrigan Clark added a comment - - edited

          Windows toy runs:

          q1 - http://perf.jenkins.couchbase.com/job/zeus/7666/ - 66705.0

          q2 plasma - http://perf.jenkins.couchbase.com/job/zeus/7667/ - 64067.0

          q3 moi - http://perf.jenkins.couchbase.com/job/zeus/7668/ - 1873.0

          Looking for q1 ~63k, q2 plasma ~65k, q3 moi ~2k

          Looks like q1 and q2 are better, but there is still a slight regression in q3
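To make the comparison against the targets concrete, here is a minimal sketch that computes the signed percentage difference of each toy run from its target. The throughput figures are the ones quoted above; treating "~63k", "~65k" and "~2k" as exact values is an assumption made only for illustration, and the helper name `pct_vs_target` is hypothetical.

```python
# Toy-run throughput (queries/sec) as reported in the comment above.
results = {"q1": 66705.0, "q2 plasma": 64067.0, "q3 moi": 1873.0}

# Approximate targets from the comment, read as exact numbers (assumption).
targets = {"q1": 63000.0, "q2 plasma": 65000.0, "q3 moi": 2000.0}


def pct_vs_target(result, target):
    """Signed percentage difference of a result relative to its target."""
    return (result - target) / target * 100.0


deltas = {name: pct_vs_target(results[name], targets[name]) for name in results}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}% vs target")
```

Under that assumption, q1 lands roughly 6% above its target, q2 plasma within about 1.5% of it, and q3 moi roughly 6% below.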

           


          Donald.haggart Donald Haggart added a comment

          OK, compared to 1.16.6, Q3 is still better with 1.17.2, so I think we'll go ahead with the golang upgrade.  I'll look into Q3 afterwards (though I suspect it will be what it is).

          build-team Couchbase Build Team added a comment

          Build couchbase-server-7.1.0-1606 contains query commit 1cd1de5 with commit message:
          MB-49185 Upgrade golang to 1.17.2 on Windows only.

          Donald.haggart Donald Haggart added a comment

          Linux:

          Tested:

          "Avg. Query Throughput (queries/sec), Q3, Range Scan, Plasma, request_plus"

          Note the result for 7.1.0-1283, since that build is being used as the benchmark on Windows:

          7.1.0-1250 -> 3693.0 (showfast)
          7.1.0-1283 -> 4236.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12704/)
          7.1.0-1295 -> 4214.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12731/)
          7.1.0-1306 -> 4186.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12735/)
          7.1.0-1307 -> 3576.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12737/)
          7.1.0-1345 -> 3936.0 (showfast)
          7.1.0-1357 -> 3955.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12739/)
          7.1.0-1358 -> 3433.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12741/) !!
          7.1.0-1358 -> 3611.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12759/) !!
          7.1.0-1358 -> 3347.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12771/) !!
          7.1.0-1558 -> 3806.0 (showfast)
          7.1.0-1601 -> 3943.0 (showfast)
          7.1.0-1611 -> 4131.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12772/)

          OK, so the current version (7.1.0-1611) looks like it is back to where we'd expect it to be. 1306->1307 just enables read from replicas; 1357->1358 just inverts the control. No material query changes in either.

          However, it seems this test is incredibly inconsistent: three runs with the exact same version (7.1.0-1358) and we have an ~8% variance. I suspect we can't read anything into changes of less than 10%, and even those exceeding this need to be repeated several times, I think, before judging anything.

          -> I guess there is a need to investigate why the test varies so much, but I think the starting point should be a review of the test itself and its environment.

          Just to check that this is not down to recent code changes, I ran it on 7.0.2-6703:

          7.0.2-6703 -> 3534.0 (showfast)
          7.0.2-6703 -> 3281.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12774/)
          7.0.2-6703 -> 3284.0 (http://perf.jenkins.couchbase.com/job/iris-multi-client/12777/)

          (~7% variance)

          I don't think there is anything to pursue here, but nevertheless:

          Windows:

          7.1.0-1611 -> 1844.0 (http://perf.jenkins.couchbase.com/job/zeus/7692/)
          7.1.0-1611 -> 1746.0 (http://perf.jenkins.couchbase.com/job/zeus/7693/)
          7.1.0-1611 -> 1858.0 (http://perf.jenkins.couchbase.com/job/zeus/7694/)

          (~6% variance)
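The run-to-run "variance" percentages quoted in this comment can be reproduced approximately with a short calculation. The comment does not state which formula was used, so the sketch below assumes one plausible reading: the range (max minus min) of the repeated runs, expressed as a percentage of their mean. The function name `spread_pct` is hypothetical.

```python
from statistics import mean


def spread_pct(samples):
    """Range (max - min) of the samples as a percentage of their mean.

    This is an assumed definition; the comment does not say how its
    "variance" figures were derived.
    """
    return (max(samples) - min(samples)) / mean(samples) * 100.0


# Repeated runs of the same build, taken from the results listed above.
runs = {
    "Linux 7.1.0-1358": [3433.0, 3611.0, 3347.0],
    "Linux 7.0.2-6703": [3534.0, 3281.0, 3284.0],
    "Windows 7.1.0-1611": [1844.0, 1746.0, 1858.0],
}

for label, samples in runs.items():
    print(f"{label}: {spread_pct(samples):.1f}% spread over {len(samples)} runs")
```

With this definition the three groups come out between roughly 6% and 8%, in the same region as the figures quoted in the comment. Dividing by the minimum or the maximum instead of the mean shifts each value by a few tenths of a percent.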

          Donald.haggart Donald Haggart added a comment

          In line with my above testing I'm going to mark this as resolved now.  Fluctuations in Q3 results seem to be normal/expected and the current version seems to have recouped any major deficit.
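The rule of thumb from the testing above (changes under about 10% cannot be distinguished from noise, and larger ones should be repeated before judging) could be sketched as a simple check. This is an illustration only: the 10% threshold is taken from the comment rather than measured, comparing medians is an assumed choice, and the names `change_pct` and `is_significant` are hypothetical.

```python
from statistics import median

# Threshold taken from the "less than 10%" remark; not a measured noise bound.
NOISE_THRESHOLD_PCT = 10.0


def change_pct(baseline_runs, candidate_runs):
    """Signed change of the candidate median relative to the baseline median, in percent."""
    base = median(baseline_runs)
    return (median(candidate_runs) - base) / base * 100.0


def is_significant(baseline_runs, candidate_runs, threshold_pct=NOISE_THRESHOLD_PCT):
    """True only if the absolute change reaches the noise threshold."""
    return abs(change_pct(baseline_runs, candidate_runs)) >= threshold_pct


# Linux Q3 results quoted earlier: the 7.1.0-1283 benchmark vs the current 7.1.0-1611.
baseline = [4236.0]
candidate = [4131.0]

print(f"change: {change_pct(baseline, candidate):+.1f}%")
print("significant" if is_significant(baseline, candidate) else "within noise")
```

For these two builds the change is between 2% and 3%, well inside the assumed noise band, which matches the conclusion that the current version has recovered. With only one run per build the check is weak; passing several runs per build makes the medians more meaningful.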

          People

            korrigan.clark Korrigan Clark
            korrigan.clark Korrigan Clark