Details
-
Bug
-
Resolution: Won't Fix
-
Critical
-
6.0.1
-
component cluster
-
Untriaged
-
-
Unknown
-
Storage-Sprint-Dec-21-2018
Description
Build : 6.0.1-1992
Test : -test tests/2i/test_idx_rebalance_replica_vulcan_kv_opt.yml -scope tests/2i/scope_idx_rebalance_replica_vulcan.yml
Scale: 3
Iteration : 2
In the 2nd iteration of this test, we have observed that the indexer service on 172.23.104.25 experienced OOM killings by the OS twice :
[ns_server:info,2018-12-06T22:57:13.488-08:00,babysitter_of_ns_1@127.0.0.1:indexer-goport<0.30503.9>:goport:handle_port_os_exit:458]Port exited with status 137
|
{[{<<"[goport(/opt/couchbase/bin/indexer)] 2018/12/06 22:57:13 child process exited with status 137\n">>,
|
[ns_server:info,2018-12-07T06:05:31.796-08:00,babysitter_of_ns_1@127.0.0.1:indexer-goport<0.16852.17>:goport:handle_port_os_exit:458]Port exited with status 137
|
{[{<<"[goport(/opt/couchbase/bin/indexer)] 2018/12/07 06:05:31 child process exited with status 137\n">>,
|
On checking the indexer logs as to when was the last time the memQuotaTuner tuned the quota to prevent OOM killings before the last OOM killing at 2018-12-07T06:05:31, I see that it was long before this time.
zgrep -i "freePercent" indexer.log* | grep -i newQuota
Last few times before the OOM killing:
indexer.log.3.gz:2018-12-07T03:34:44.063-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20517277696, freePercent: 13.062550501379903, currentQuota=4067237560, newQuota=4270599438
|
indexer.log.3.gz:2018-12-07T03:36:50.740-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20469211136, freePercent: 13.118506805216432, currentQuota=4270599438, newQuota=4473961316
|
indexer.log.3.gz:2018-12-07T03:57:32.498-08:00 [Info] Plasma: Adaptive memory quota tuning (decrementing): RSS:21414617088, freePercent:9.387591697781145, currentQuota=4473961316, newQuota=4067237560
|
After the OOM killing:
indexer.log.1.gz:2018-12-07T10:24:27.967-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20569579520, freePercent: 13.30386410089884, currentQuota=15455502729, newQuota=15658864607
|
indexer.log.1.gz:2018-12-07T10:38:08.117-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20637208576, freePercent: 13.005957959686338, currentQuota=15658864607, newQuota=15862226485
|
indexer.log.1.gz:2018-12-07T10:41:38.164-08:00 [Info] Plasma: Adaptive memory quota tuning (incrementing): RSS:20647723008, freePercent: 13.042174576163626, currentQuota=15862226485, newQuota=16065588363
|
The freePercent was around 5% for quite some time before the OOM killing.
indexer.log.2.gz:2018-12-07T06:04:05.033-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22460481536, freePercent:5.222436096024933, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:10.441-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22459219968, freePercent:5.226400347287988, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:15.896-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22461030400, freePercent:5.199694671083789, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:21.437-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22462640128, freePercent:5.199776240039819, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:26.665-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22465871872, freePercent:5.242991472944475, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:34.128-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22470647808, freePercent:5.1807869870760515, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:41.592-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473359360, freePercent:5.1523031076304004, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:47.011-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22475464704, freePercent:5.19566516465591, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:52.365-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22476578816, freePercent:5.212027897235514, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:04:58.068-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22478561280, freePercent:5.1558105727396875, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:05:04.042-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22474186752, freePercent:5.169024743616536, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:05:09.431-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473924608, freePercent:5.236531211626904, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:05:14.797-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473342976, freePercent:5.169530471143921, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:05:20.392-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22473113600, freePercent:5.20046141927047, currentQuota=4067237560
|
indexer.log.2.gz:2018-12-07T06:05:27.323-08:00 [Info] Plasma: Adaptive memory quota tuning RSS:22470930432, freePercent:5.1829240937240355, currentQuota=4067237560
|
Shouldnt have the memQuotaTuner run more often and tuned the memory quota to above the 10% freePercent threshold to avoid running into OOM killing ?