
MB-49933: Investigate: 99.9th percentile GET latency (ms) increased by 30-40% in unbalanced server groups



    Description

      https://hub.internal.couchbase.com/confluence/display/~bo-chun.wang/Unbalanced+Server+Group+Performance+Testing

      Compared to the balanced server group, the 99.9th percentile read latency increased by 30-40% in the unbalanced server group.

      Read Heavy (80/20 Read/Update)

      Number of Nodes | Server groups          | 99.9th pct GET (ms) | 99.9th pct SET (ms) | Job
      6               | 2 (3 nodes + 3 nodes)  | 0.95                | 0.79                | http://perf.jenkins.couchbase.com/job/hercules-dev/72/
      5               | 2 (3 nodes + 2 nodes)  | 1.36                | 0.66                | http://perf.jenkins.couchbase.com/job/hercules-dev/71/

       

      Write Heavy (20/80 Read/Update)

      Number of Nodes | Server groups          | 99.9th pct GET (ms) | 99.9th pct SET (ms) | Job
      6               | 2 (3 nodes + 3 nodes)  | 1.45                | 0.56                | http://perf.jenkins.couchbase.com/job/hercules-dev/76/
      5               | 2 (3 nodes + 2 nodes)  | 1.85                | 0.53                | http://perf.jenkins.couchbase.com/job/hercules-dev/75/
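
      A quick cross-check of the deltas above (a sketch only; the latency figures are taken straight from the two tables):

        # Relative increase in 99.9th percentile GET latency, unbalanced vs balanced.
        read_heavy  = {"balanced": 0.95, "unbalanced": 1.36}   # ms
        write_heavy = {"balanced": 1.45, "unbalanced": 1.85}   # ms

        for name, r in (("read-heavy", read_heavy), ("write-heavy", write_heavy)):
            pct = (r["unbalanced"] / r["balanced"] - 1) * 100
            print(f"{name}: +{pct:.0f}% GET p99.9")
        # read-heavy:  +43% GET p99.9
        # write-heavy: +28% GET p99.9
        # which brackets the 30-40% figure quoted in the summary, with the
        # read-heavy workload at the worse end.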

       

      Attachments

        1. 172.23.100.121.zip (24.70 MB)
        2. 172.23.100.124.zip (20.05 MB)
        3. screenshot-1.png (99 kB)
        4. screenshot-10.png (67 kB)
        5. screenshot-2.png (86 kB)
        6. screenshot-3.png (89 kB)
        7. screenshot-4.png (99 kB)
        8. screenshot-5.png (82 kB)
        9. screenshot-6.png (112 kB)
        10. screenshot-7.png (141 kB)
        11. screenshot-8.png (124 kB)
        12. screenshot-9.png (73 kB)

        Activity

          dfinlay Dave Finlay added a comment - - edited

          Thanks for this Bo-Chun. I took a look at the read-heavy test. I didn't bother looking at the balanced case - just went straight to look at the unbalanced case. I looked at node .121, which is in server group 1 with 3 nodes and node .124, which is in server group 2 with 2 nodes.

          The server-side 3 9s GET latency on the two nodes looks as follows - right around 1 ms on .121 and approximately 1.4-1.5 ms on .124.

          GETs and SETs per second on the two nodes are very similar - this reflects the fact that actives are balanced and is what we want to see:

          It seems very likely that the latency difference is due to a residency ratio difference - and indeed this is the case. Item counts are 60 M active on both nodes with 40 M replica items on .121 and 90 M replica items on .124:
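
          (As an aside, these replica counts are what group-aware replica placement predicts for ~60 M actives per node and one replica; a minimal sketch, assuming replicas of each group's actives are spread evenly across the nodes of the other group:)

            # Replica items per node in the 3+2 layout, one replica, group-aware placement.
            ACTIVE_PER_NODE = 60_000_000
            groups = {"group1": 3, "group2": 2}        # nodes per server group

            # Replicas of group1's actives land on group2's nodes, and vice versa.
            replicas_per_node = {
                "group1": groups["group2"] * ACTIVE_PER_NODE / groups["group1"],
                "group2": groups["group1"] * ACTIVE_PER_NODE / groups["group2"],
            }
            print(replicas_per_node)
            # {'group1': 40000000.0, 'group2': 90000000.0} -> matches .121 and .124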

          Active residency ratios are .72 on .121 and .5 on .124:

          And this results in approx double the number of BG fetches on .124 as compared with .121 (1170/s vs 650/s):

          On .121 approx 27% of the GETs go to disk; on .124 approx 49% of the GETs go to disk – and I think this completely explains the difference between the latency in the 3 node server group and that of the 2 node server group.
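
          (The two percentages line up with the BG fetch rates above; a rough cross-check using only the numbers quoted in this comment:)

            # Implied GET throughput per node = BG fetches per second / fraction of
            # GETs that go to disk (both quoted above).
            nodes = {
                ".121": {"bg_fetches_per_s": 650,  "disk_fraction": 0.27},
                ".124": {"bg_fetches_per_s": 1170, "disk_fraction": 0.49},
            }
            for name, n in nodes.items():
                print(f"{name}: ~{n['bg_fetches_per_s'] / n['disk_fraction']:.0f} GETs/s")
            # .121: ~2407 GETs/s, .124: ~2388 GETs/s -- roughly equal per-node GET
            # rates, consistent with actives (and client traffic) being balanced.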

          I'm not sure how much there is to add. I think this is essentially to be expected for this configuration.

          Shivani Gupta: interested in your thoughts.


          shivani.gupta Shivani Gupta added a comment -

          Thanks Dave Finlay for your analysis. It makes sense. It also explains why SET latencies are not impacted but GET latencies are.

          Seems like completely expected behavior to me. But thanks for finding this, Bo-Chun Wang. I am sure I will run into situations where I will have to explain this to users.
          dfinlay Dave Finlay added a comment -

          Sounds good. I will take a look at the write-heavy workload at some point. I asked for it in case the extra compactions that need to be done in the 2 node server group caused an even greater effect. At least in this particular write-heavy test, it doesn't seem that the GET latencies degraded more than in the read-heavy test, so there are no immediate red flags, but still we should take a closer look.

          I think the read-heavy test is a useful test in that it tested a system where the % of GETs going to disk is much higher in the 2 node server group. This is a complex function of residency ratio and cache misses, and in some cases you won't see much change (e.g. sufficient memory to have the working set in memory), but at least this shows us that there can be quite a difference across the server groups - and of course it could be worse than this. We should document this and advise customers that it's preferable to run in a balanced configuration.
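
          (To illustrate why the 99.9th percentile is so sensitive to the cache-miss fraction, a purely illustrative model, assuming only that a disk GET is always slower than an in-memory GET:)

            # Where the 99.9th percentile "lands" as a function of the fraction p of
            # GETs that need a BG fetch (illustrative; assumes disk GETs are slower
            # than in-memory GETs).
            def p999_source(p, target=0.001):
                if p < target:
                    return "in-memory tail (p99.9 barely affected by disk)"
                # The slowest 0.1% of all GETs are then all disk GETs, so p99.9 sits
                # at the (1 - target/p) quantile of the disk-fetch latency distribution.
                return f"disk-fetch tail, at its {100 * (1 - target / p):.1f}th percentile"

            for p in (0.0005, 0.05, 0.27, 0.49):
                print(f"p = {p}: {p999_source(p)}")
            # p = 0.0005: in-memory tail (p99.9 barely affected by disk)
            # p = 0.05:   disk-fetch tail, at its 98.0th percentile
            # p = 0.27:   disk-fetch tail, at its 99.6th percentile
            # p = 0.49:   disk-fetch tail, at its 99.8th percentile
            # Once a sizable fraction of GETs miss RAM, p99.9 tracks the disk latency
            # distribution, and sits deeper in its tail as the miss rate grows.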


          shivani.gupta Shivani Gupta added a comment -

          Point noted about documentation.
          dfinlay Dave Finlay added a comment -

          I took a look at the write-heavy workload. First I looked at the 3 node server group vs the 2 node server group. There's a small difference in server-side 3 9s GET latency - ~1.95 ms vs ~2.03 ms:

          The active and replica items and active and replica residency ratios are the same as in the read-heavy case. I'll just show the residency ratios:

          And we see a difference in BG fetches - though a smaller relative difference than in the read-heavy case (190/s vs 300/s):

          This seems likely to account for the smaller difference in GET latencies than in the read-heavy test.

          Unfortunately there were no compactions during this period, so the test doesn't show the effect of the smaller server group carrying more vbuckets and having to do more compactions per minute than the larger server group. Perhaps we need to tweak the test and run again to get compaction happening during the test.
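
          (For context on the vbucket imbalance mentioned here, a rough sketch, assuming the default 1024 vbuckets, one replica and group-aware replica placement:)

            # vbuckets per node in a 3+2 server-group layout (assumes the default
            # 1024 vbuckets and one replica placed in the other server group).
            VBUCKETS = 1024
            groups = {"group1": 3, "group2": 2}
            total_nodes = sum(groups.values())

            active_per_node = VBUCKETS / total_nodes               # ~205 on every node
            for g, size in groups.items():
                other_actives = VBUCKETS - active_per_node * size  # actives owned by the other group
                replica_per_node = other_actives / size            # their replicas land here
                print(f"{g}: ~{active_per_node:.0f} active + ~{replica_per_node:.0f} replica "
                      f"= ~{active_per_node + replica_per_node:.0f} vbuckets per node")
            # group1: ~205 active + ~137 replica = ~341 vbuckets per node
            # group2: ~205 active + ~307 replica = ~512 vbuckets per node
            # The nodes in the 2 node group own ~50% more vbuckets, hence more files
            # to compact for the same mutation rate.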

          I did compare node .121 in the balanced version of the test with the same node in the unbalanced version of the test. Given that it's a 6 node cluster versus a 5 node cluster, we see a difference in residency ratio. E.g. in the 6 node test we see close to 100% active resident ratio:

          And correspondingly we see a low number of BG fetches / s (around 25/s):

          Which explains the better 3 9s latency in the balanced test.


          shivani.gupta Shivani Gupta added a comment -

          Discussed with Ray Offiah to add the following sentence where we discuss unequal server groups in the documentation:

          "Since smaller server groups will carry more replicas on each node, their performance may not be the same as the nodes in the bigger server groups. For best performance, try to maintain equality of server groups."

          People

            dfinlay Dave Finlay
            bo-chun.wang Bo-Chun Wang