
MB-21552: MOI memory steadily increasing with constant number of items


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 5.0.0
    • Affects Version/s: 4.5.0, 4.5.1, 4.6.0
    • Component/s: secondary-index
    • Labels: None

    Description

      On day 4 of the longevity test, the indexer warned that 92% of memory was in use and all indexes entered the Paused state. This could be a leak somewhere, because the usage increased each day and never seems to have gone down. Also, the item count levels out at ~15M on the bucket under a mixed workload of sets/gets/deletes, so I would expect indexer memory to also be fairly constant.

      The node has 30GB of memory, and the index RAM alert threshold was 75%. The 92% warning:

      [user:info,2016-10-28T18:05:07.684-07:00,ns_1@172.23.105.60:<0.7983.373>:menelaus_web_alerts_srv:global_alert:89]Warning: approaching max index RAM. Indexer RAM on node "172.23.105.60" is 92%, which is at or above the threshold of 75%.
      [ns_server:info,2016-10-28T18:05:07.685-07:00,ns_1@172.23.105.60:ns_log<0.1821.0>:ns_log:handle_cast:188]suppressing duplicate log menelaus_web_alerts_srv:undefined([<<"Warning: approaching max index RAM. Indexer RAM on node \"172.23.105.60\" is 92%, which is at or above the threshold of 75%.">>]) because it's been seen 17 times in the past 50.99999 secs (last seen 2.995559 secs ago
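
      For reference, the 92% in the alert appears to be indexer memory usage relative to the configured index RAM quota. A quick way to track that ratio during the run is a small poller like the sketch below; it assumes the indexer stats endpoint on port 9102 and the memory_used / memory_quota stat names, both of which may differ by build.

      // Sketch only: poll the indexer stats endpoint and report memory_used vs memory_quota.
      // The port (9102) and the stat names are assumptions and may differ by build.
      package main

      import (
          "encoding/json"
          "fmt"
          "net/http"
      )

      func main() {
          resp, err := http.Get("http://172.23.105.60:9102/stats")
          if err != nil {
              panic(err)
          }
          defer resp.Body.Close()

          var stats map[string]interface{}
          if err := json.NewDecoder(resp.Body).Decode(&stats); err != nil {
              panic(err)
          }

          // JSON numbers decode as float64 when the target is interface{}.
          used, _ := stats["memory_used"].(float64)
          quota, _ := stats["memory_quota"].(float64)
          if quota > 0 {
              fmt.Printf("indexer memory_used=%.0f quota=%.0f (%.1f%%)\n",
                  used, quota, 100*used/quota)
          }
      }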
      

      Then the indexes are paused:

      2016-10-28T16:06:30.180-07:00 [Info] Indexer::monitorMemUsage ManualGC Time Taken 967.896727ms
      2016-10-28T16:06:30.229-07:00 [Info] Indexer::ReadMemstats Time Taken 5.223638ms
      2016-10-28T16:06:30.229-07:00 [Info] Indexer::monitorMemUsage MemoryUsed Total 15062880256 Idle 65536
      2016-10-28T16:06:30.229-07:00 [Info] Indexer::handleIndexerPause
      2016-10-28T16:06:30.230-07:00 [Info] ClustMgr:handleSetLocalValue Key IndexerState Value Paused
      2016-10-28T16:06:30.236-07:00 [Info] Indexer::handleIndexerPause Indexer State Changed to Paused
      2016-10-28T16:06:30.236-07:00 [Info] Timekeeper::handleIndexerPause
      2016-10-28T16:06:30.237-07:00 [Info] MutationStreamReader::handleIndexerPause
      2016-10-28T16:06:30.237-07:00 [Info] MutationMgr::handleIndexerPause Stream MAINT_STREAM Paused
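
      For context, the pause above is driven by the indexer's periodic memory check (the ManualGC / ReadMemstats / monitorMemUsage lines). The sketch below shows the general shape of such a check; it is not the actual indexer code, and the quota value, the pause threshold, and the way "used" memory is derived from runtime.MemStats are all assumptions.

      // Simplified sketch of a periodic memory monitor in the spirit of the
      // Indexer::monitorMemUsage log lines above; NOT the actual indexer implementation.
      package main

      import (
          "fmt"
          "runtime"
          "runtime/debug"
          "time"
      )

      const (
          memQuota      = 16 * 1024 * 1024 * 1024 // assumed index RAM quota, in bytes
          pauseFraction = 0.95                    // assumed high-water mark for pausing
      )

      func monitorMemUsage(pause func()) {
          for range time.Tick(10 * time.Second) {
              start := time.Now()
              debug.FreeOSMemory() // force a GC and return idle heap to the OS ("ManualGC")
              fmt.Printf("ManualGC Time Taken %v\n", time.Since(start))

              var ms runtime.MemStats
              runtime.ReadMemStats(&ms)
              // One plausible definition of "used": memory obtained from the OS
              // minus heap memory already released back to it.
              used := ms.Sys - ms.HeapReleased
              idle := ms.HeapIdle - ms.HeapReleased
              fmt.Printf("MemoryUsed Total %d Idle %d\n", used, idle)

              if float64(used) > pauseFraction*memQuota {
                  pause() // move the indexer (and its mutation streams) to the Paused state
              }
          }
      }

      func main() {
          monitorMemUsage(func() { fmt.Println("Indexer State Changed to Paused") })
      }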
      

      I restarted the indexer and memory usage went down to 7GB and all indexes were active again.

      Snippets are from: https://s3.amazonaws.com/scalability-mcafee/collectinfo-2016-10-29T210217-ns_1%40172.23.105.60.zip
      I have also attached logs from the other nodes and a history of indexer collectinfos taken throughout the run, for tracing memory usage.

      *Regression unknown, as this is the first time MOI has been run in a longevity test.

      Attachments

        Issue Links


          Activity

            People

              Assignee: Sarath Lakshman
              Reporter: Tommie McAfee (Inactive)

