  2. MB-48834

Improve XDCR performance with Magma


Details

    Description

      I re-ran two existing XDCR tests with Magma. Compared to Couchstore, Magma throughput is about 50% lower. I am opening this ticket to track XDCR+Magma performance improvements. All runs used build 7.1.0-1401.

       

      Avg. initial XDCR rate (items/sec), 1 -> 1 (2 source nozzles, 4 target nozzles), 1 bucket x 100M x 1KB

       

       

      Avg. initial XDCR rate (items/sec), 5 -> 5 (2 source nozzles, 4 target nozzles), 1 bucket x 250M x 1KB

       

      Attachments

        1. couchstore-105-backfills.png
          46 kB
          Daniel Owen
        2. couchstore-105-sentitems.png
          46 kB
          Daniel Owen
        3. magma-105.png
          43 kB
          Daniel Owen
        4. magma-105-backfills.png
          47 kB
          Daniel Owen
        5. magma-105-sentitems.png
          50 kB
          Daniel Owen
        6. Screen Shot 2021-10-27 at 2.00.57 PM.png
          199 kB
          Bo-Chun Wang
        7. Screen Shot 2021-10-27 at 2.05.55 PM.png
          184 kB
          Bo-Chun Wang
        8. Screen Shot 2021-10-27 at 2.08.14 PM.png
          178 kB
          Bo-Chun Wang
        9. Screen Shot 2021-10-27 at 2.09.06 PM.png
          175 kB
          Bo-Chun Wang
        10. Screen Shot 2021-10-27 at 2.15.50 PM.png
          125 kB
          Bo-Chun Wang
        11. Screen Shot 2022-01-12 at 4.42.26 PM.png
          111 kB
          Bo-Chun Wang
        12. Screenshot 2022-01-13 at 11.13.12 AM.png
          63 kB
          Sarath Lakshman
        13. Screenshot 2022-01-13 at 11.16.39 AM.png
          96 kB
          Sarath Lakshman
        14. Screenshot 2022-01-14 at 8.18.17 AM.png
          56 kB
          Sarath Lakshman

        Issue Links

          For Gerrit Dashboard: MB-48834

          Activity

            sarath Sarath Lakshman added a comment - - edited

            I have a prototype that exposes the magma bloom filter through a Magma::KeyMayExist API and uses that API to enable the KeyMayExist check for magma in the frontend threads. Throughput improved from 278766/s to 424707/s.

            We see p50 GET_META latency drop from 45us to 10us with this change, which skips the bgFetch code path. This is roughly equivalent to the bg_load latency (12us) we observed in the prior run (saving the bgFetch cost). We may want to investigate whether the 35us for the bgFetch path is reasonable. I don't know if it is related to having more frontend threads than bgFetch threads, so that frontend threads end up waiting for bgFetch threads. This needs to be validated.
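            To illustrate the idea of a frontend-side KeyMayExist check, here is a minimal Python sketch. It is not the Magma or kv-engine implementation: the BloomFilter class, the get_meta function, and the "memory" / "bloom-skip" / "bgfetch" labels are all hypothetical names made up for this example. It only shows why a negative bloom filter answer lets a GET_META return "not found" without scheduling a background fetch.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: never a false negative, occasionally a false positive."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions from salted BLAKE2b digests.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(key, digest_size=8, salt=bytes([i])).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def key_may_exist(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def get_meta(key, resident, bloom, bg_fetch):
    """Hypothetical frontend GET_META; returns (metadata or None, path taken)."""
    if key in resident:
        return resident[key], "memory"
    if not bloom.key_may_exist(key):
        # The filter rules the key out, so no background fetch is needed.
        return None, "bloom-skip"
    # Possibly on disk (or a false positive): pay for the background fetch.
    return bg_fetch(key), "bgfetch"
```

            In this sketch a lookup for a key that was never stored takes the "bloom-skip" path. That is the case the prototype appears to target, assuming most GET_META requests in an initial XDCR load are for keys that do not yet exist on the target.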

            The following data is collected for "GET_META"
            [  0.00 -   3.00]us (0.0000%)       4532|
            [  3.00 -   6.00]us (10.0000%)  33703673| ############################################
            [  6.00 -   7.00]us (20.0000%)  22812220| #############################
            [  7.00 -   8.00]us (30.0000%)  18972811| ########################
            [  8.00 -   9.00]us (40.0000%)  15037567| ###################
            [  9.00 -  10.00]us (50.0000%)  12711962| ################
            [ 10.00 -  11.00]us (55.0000%)  11717598| ###############
            [ 11.00 -  12.00]us (60.0000%)  11238608| ##############
            [ 12.00 -  13.00]us (65.0000%)  10593831| #############
            [ 13.00 -  14.00]us (70.0000%)   9490448| ############
            [ 14.00 -  15.00]us (75.0000%)   8026767| ##########
            [ 15.00 -  16.00]us (77.5000%)   6497436| ########
            [ 16.00 -  16.00]us (80.0000%)         0|
            [ 16.00 -  17.00]us (82.5000%)   5172050| ######
            [ 17.00 -  19.00]us (85.0000%)   7659343| #########
            [ 19.00 -  20.00]us (87.5000%)   3021360| ###
            [ 20.00 -  21.00]us (88.7500%)   2690605| ###
            [ 21.00 -  22.00]us (90.0000%)   2422527| ###
            [ 22.00 -  23.00]us (91.2500%)   2178659| ##
            [ 23.00 -  24.00]us (92.5000%)   1916079| ##
            [ 24.00 -  26.00]us (93.7500%)   2995509| ###
            [ 26.00 -  27.00]us (94.3750%)   1136144| #
            [ 27.00 -  28.00]us (95.0000%)    936203| #
            [ 28.00 -  29.00]us (95.6250%)    772154| #
            [ 29.00 -  31.00]us (96.2500%)   1178621| #
            [ 31.00 -  35.00]us (96.8750%)   1460445| #
            [ 35.00 -  37.00]us (97.1875%)    457187|
            [ 37.00 -  41.00]us (97.5000%)    573703|
            [ 41.00 -  49.00]us (97.8125%)    475481|
            [ 49.00 -  67.00]us (98.1250%)    604837|
            [ 67.00 -  79.00]us (98.4375%)    669727|
            [ 79.00 -  87.00]us (98.5938%)    495717|
            [ 87.00 -  91.00]us (98.7500%)    232754|
            

            http://perf.jenkins.couchbase.com/job/titan-xdcr-dev/92/console
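
            For readers who want to derive a figure such as p50 from a bucketed latency histogram, the sketch below shows one way to do it. The percentile_bucket helper and the example buckets are hypothetical and use made-up counts; they are not the GET_META data from this run, which is truncated above 98.75%.

```python
def percentile_bucket(buckets, pct):
    """Return the (low_us, high_us) bucket containing the pct-th percentile.

    buckets is a list of (low_us, high_us, count) tuples in ascending order.
    """
    total = sum(count for _, _, count in buckets)
    if total == 0:
        raise ValueError("empty histogram")
    threshold = total * pct / 100.0
    running = 0
    for low, high, count in buckets:
        running += count
        if running >= threshold:
            return low, high
    # Only reached through floating-point rounding at pct == 100.
    return buckets[-1][0], buckets[-1][1]


# Made-up histogram, for illustration only.
example = [(0, 3, 10), (3, 6, 40), (6, 10, 30), (10, 20, 15), (20, 50, 5)]
p50 = percentile_bucket(example, 50)
p95 = percentile_bucket(example, 95)
```

            The result is a bucket rather than a single value, so the reported percentile is only as precise as the bucket width.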


            sarath Sarath Lakshman added a comment -

            Dave Rigby The goal here is not to measure the highest throughput for XDCR with magma/couchstore. In the current standard XDCR test (target nozzle=4), magma XDCR throughput is lower than couchstore's due to higher GET_META latency. We are investigating whether we can achieve a number similar to couchstore's.

            drigby Dave Rigby added a comment - - edited

            If we have a throughput test in which kv-engine is being measured, I would assert that it should be calibrated so kv-engine is essentially pegged one way or another. Otherwise, if say the kv-engine cost for SetWithMeta went up by 10% (we suddenly did something which took 10% more CPU), one would not necessarily notice: CPU would go up by 10% but throughput would be sustained.

            If the test is instead measuring the throughput of XDCR itself, then XDCR should be the bottleneck, which, given the speedup we get from increasing nozzles, doesn’t seem to be the case either.

            Essentially a max throughput test should be pegging at least one of the components under test; and that doesn’t appear to be the case here.
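
            The argument above can be made concrete with a toy model. This is a simplified sketch, not a model of kv-engine: the run function, the CPU budget and the per-operation costs are all hypothetical. It assumes throughput is the smaller of the offered load and the CPU-limited capacity.

```python
def run(offered_ops, cpu_budget_us, cost_per_op_us):
    """Return (throughput in ops/s, CPU utilization) for a toy single-resource model.

    cpu_budget_us is the CPU time available per second, in microseconds.
    """
    capacity = cpu_budget_us / cost_per_op_us
    throughput = min(offered_ops, capacity)
    utilization = throughput * cost_per_op_us / cpu_budget_us
    return throughput, utilization


BUDGET_US = 1_000_000  # hypothetical: one core's worth of CPU per second

# Not pegged: a 10% cost increase only shows up as extra CPU.
unpegged_before = run(50_000, BUDGET_US, 10)
unpegged_after = run(50_000, BUDGET_US, 11)

# Pegged: the same 10% cost increase shows up as lower throughput.
pegged_before = run(200_000, BUDGET_US, 10)
pegged_after = run(200_000, BUDGET_US, 11)
```

            In the unpegged case the measured throughput is identical before and after the regression, so a throughput-only test would not flag it. In the pegged case throughput falls by roughly 9%.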

            jliang John Liang added a comment -

            > is nozzles=4 still a good default? Is that something we already tell customers to increase?

            We don't have a sizing formula for nozzles. For example, the number of nozzles can depend on the number of CPUs at the target cluster. If the default is too high, it could lead to CPU saturation or temp failures. So I would rather keep it as it is. If the source cluster has a mutation rate higher than 80K per node, the customer can increase the number of nozzles, one at a time, until it reaches the desired throughput.
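
            The one-at-a-time procedure described above could be scripted roughly as follows. This is only a sketch: tune_nozzles, the measure callback, max_nozzles, cpu_limit and the fake_measure numbers are all hypothetical, and no real Couchbase API or setting name is used. In practice measure would change the nozzle setting, let replication settle, and read the replication rate and target CPU from monitoring.

```python
def tune_nozzles(measure, target_rate, start=4, max_nozzles=16, cpu_limit=0.9):
    """Increase the nozzle count one at a time until the target rate is met.

    measure(nozzles) must return (items_per_sec, target_cpu_utilization).
    Stops early if the target cluster's CPU reaches cpu_limit.
    """
    nozzles = start
    while True:
        rate, target_cpu = measure(nozzles)
        if rate >= target_rate or target_cpu >= cpu_limit or nozzles >= max_nozzles:
            return nozzles, rate
        nozzles += 1


def fake_measure(nozzles):
    # Invented linear behaviour, for illustration only.
    return 20_000 * nozzles, 0.08 * nozzles


chosen, achieved = tune_nozzles(fake_measure, target_rate=120_000)
```

            The CPU guard reflects the concern that too many nozzles can saturate the target or cause temp failures; the thresholds here are placeholders, not recommendations.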


            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.1.0-2168 contains magma commit 8680726 with commit message:
            MB-48834 magma: Introduce KeyMayExist API

            People

              sarath Sarath Lakshman
              bo-chun.wang Bo-Chun Wang