Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-32281

[System Test] Seeing OOM killing for indexer

    XMLWordPrintable

Details

    Description

      Build : 6.0.1-1992
      Test : -test tests/2i/test_idx_rebalance_replica_vulcan_kv_opt.yml -scope tests/2i/scope_idx_rebalance_replica_vulcan.yml
      Scale: 3
      Iteration : 2

      In the 2nd iteration of this test, we have observed that the indexer service on 172.23.104.25 experienced OOM killings by the OS twice :

      [ns_server:info,2018-12-06T22:57:13.488-08:00,babysitter_of_ns_1@127.0.0.1:indexer-goport<0.30503.9>:goport:handle_port_os_exit:458]Port exited with status 137
                                      {[{<<"[goport(/opt/couchbase/bin/indexer)] 2018/12/06 22:57:13 child process exited with status 137\n">>,
      [ns_server:info,2018-12-07T06:05:31.796-08:00,babysitter_of_ns_1@127.0.0.1:indexer-goport<0.16852.17>:goport:handle_port_os_exit:458]Port exited with status 137
                                      {[{<<"[goport(/opt/couchbase/bin/indexer)] 2018/12/07 06:05:31 child process exited with status 137\n">>,
      

      On checking the indexer logs as to when was the last time the memQuotaTuner tuned the quota to prevent OOM killings before the last OOM killing at 2018-12-07T06:05:31, I see that it was long before this time.

      zgrep -i "freePercent" indexer.log* | grep -i newQuota

      Last few times before the OOM killing:

      indexer.log.3.gz:2018-12-07T03:34:44.063-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20517277696, freePercent: 13.062550501379903, currentQuota=4067237560, newQuota=4270599438
      indexer.log.3.gz:2018-12-07T03:36:50.740-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20469211136, freePercent: 13.118506805216432, currentQuota=4270599438, newQuota=4473961316
      indexer.log.3.gz:2018-12-07T03:57:32.498-08:00 [Info] Plasma: Adaptive memory quota tuning (decrementing): RSS:21414617088, freePercent:9.387591697781145, currentQuota=4473961316, newQuota=4067237560
      

      After the OOM killing:

      indexer.log.1.gz:2018-12-07T10:24:27.967-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20569579520, freePercent: 13.30386410089884, currentQuota=15455502729, newQuota=15658864607
      indexer.log.1.gz:2018-12-07T10:38:08.117-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20637208576, freePercent: 13.005957959686338, currentQuota=15658864607, newQuota=15862226485
      indexer.log.1.gz:2018-12-07T10:41:38.164-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20647723008, freePercent: 13.042174576163626, currentQuota=15862226485, newQuota=16065588363
      

      The freePercent was around 5% for quite some time before the OOM killing.

      indexer.log.2.gz:2018-12-07T06:04:05.033-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22460481536, freePercent:5.222436096024933, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:10.441-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22459219968, freePercent:5.226400347287988, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:15.896-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22461030400, freePercent:5.199694671083789, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:21.437-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22462640128, freePercent:5.199776240039819, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:26.665-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22465871872, freePercent:5.242991472944475, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:34.128-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22470647808, freePercent:5.1807869870760515, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:41.592-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473359360, freePercent:5.1523031076304004, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:47.011-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22475464704, freePercent:5.19566516465591, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:52.365-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22476578816, freePercent:5.212027897235514, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:04:58.068-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22478561280, freePercent:5.1558105727396875, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:05:04.042-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22474186752, freePercent:5.169024743616536, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:05:09.431-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473924608, freePercent:5.236531211626904, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:05:14.797-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473342976, freePercent:5.169530471143921, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:05:20.392-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473113600, freePercent:5.20046141927047, currentQuota=4067237560
      indexer.log.2.gz:2018-12-07T06:05:27.323-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22470930432, freePercent:5.1829240937240355, currentQuota=4067237560
      

      Shouldnt have the memQuotaTuner run more often and tuned the memory quota to above the 10% freePercent threshold to avoid running into OOM killing ?

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            sundar Sundar Sridharan (Inactive)
            mihir.kamdar Mihir Kamdar (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty