Couchbase Server / MB-26738

memcached fails to serve KV requests during swap rebalance due to OOM failures

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 7.1.0
    • Affects Version/s: 4.1.2, 4.6.4, 5.0.0, 5.5.0
    • Component/s: couchbase-bucket
    • Environment: CentOS 7, E5-2680 v3 (48 vCPU), 64 GB RAM, Samsung PM863a SATA SSD

    Description

      Test scenario:

      • 4 nodes
      • 1 bucket, 1 replica, full eviction
      • 1B items (~1KB), 5-10% resident ratio
      • 15K ops/sec (90% read, 10% update), 10% cache miss ratio (before rebalance)
      • Swap rebalance of one node (172.23.96.103 -> 172.23.96.104)
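
      The scenario above implies some rough per-node figures, which the sketch below works out. It is a back-of-envelope estimate only: it assumes "~1KB" means 1 KiB, that items are spread evenly across nodes, and that the 10% cache miss ratio applies to reads only. The function and constant names are illustrative and do not come from Couchbase.

```python
# Back-of-envelope sizing for the test scenario (illustrative only).
NODES = 4
REPLICAS = 1
ITEMS = 1_000_000_000
ITEM_SIZE_BYTES = 1024       # "~1KB" taken as 1 KiB (assumption)
OPS_PER_SEC = 15_000
READ_PERCENT = 90
CACHE_MISS_PERCENT = 10      # assumed to apply to reads only


def scenario_estimates() -> dict:
    """Return rough per-node and cluster-wide figures for the scenario."""
    copies = ITEMS * (1 + REPLICAS)          # active plus replica copies
    items_per_node = copies // NODES         # assumes an even distribution
    # Value bytes only; per-item metadata and overhead are ignored.
    value_gib_per_node = items_per_node * ITEM_SIZE_BYTES / 2**30
    reads = OPS_PER_SEC * READ_PERCENT // 100
    updates = OPS_PER_SEC - reads
    disk_reads = reads * CACHE_MISS_PERCENT // 100
    return {
        "items_per_node": items_per_node,
        "value_gib_per_node": value_gib_per_node,
        "reads_per_sec": reads,
        "updates_per_sec": updates,
        "disk_reads_per_sec": disk_reads,
    }


if __name__ == "__main__":
    for name, value in scenario_estimates().items():
        print(f"{name}: {value}")
```

      Under these assumptions, the node joining in the swap rebalance has to take over roughly one node's share of item copies, whose values alone are far larger than the 64 GB of RAM per node. That is in line with the low resident ratio reported for the test.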

      Swap rebalance causes high memory usage on the new node, which results in thousands of TMP OOM (temporary out-of-memory) failures. Although the number of failures is relatively small (<100K), the TMP OOM errors cause significant drops in the rate of incoming GET and SET requests.

      Graphs: http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=titan_510-1368_rebalance_a416

      Attachments

        1. curr_items.png (229 kB)
        2. disk_queue.png (177 kB)
        3. ep_num_eject_failures.png (350 kB)
        4. mem_used.png (209 kB)
        5. resident_ratio.png (176 kB)
        6. Screen Shot 2018-01-11 at 14.06.44.png (198 kB)
        7. Screen Shot 2018-01-11 at 14.07.34.png (202 kB)
        8. tmp_oom_5.1.0_1500.png (181 kB)
        9. tmp_oom.png (556 kB)

            People

              owend Daniel Owen
              pavelpaulau Pavel Paulau (Inactive)