Couchbase Server / MB-14602

indexer gets stuck in OOM loop, index building never completes


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: secondary-index
    • Security Level: Public
    • Environment: Version: 4.0.0-1891 Enterprise Edition (build-1891)
    • Triage: Untriaged
    • Operating System: Ubuntu 64-bit
    • Is this a Regression?: Unknown

    Description

      Problem:
      During primary index creation, the indexer uses a huge amount of memory. Eventually the Linux OOM killer kicks in and kills the indexer process. When the indexer is automatically restarted, it begins the indexing process again from the beginning, and again consumes memory until the Linux OOM killer kills it.

      The indexer never completes the primary index build, as it is stuck in a loop of being OOM-killed and then starting over from the beginning.

      Steps to reproduce:

      Data was loaded as KVs into bucket "noaa" on data nodes.
      (Data set can be provided if required)

      N1QL command was issued to build indexes:

      CREATE PRIMARY INDEX ON noaa USING GSI;

      <Index building never completes, as described above>
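
      One way to watch the build state from the query node is to query system:indexes through the N1QL REST service. This is only a sketch: the hostname below is a placeholder, and credentials may be required depending on the bucket configuration.

      # Placeholder host "query-node"; run from any machine that can reach port 8093.
      curl -s http://query-node:8093/query/service \
           --data-urlencode 'statement=SELECT name, state FROM system:indexes WHERE keyspace_id = "noaa";'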

      The indexer usually only gets as far as building approx 3GB of indexes before being killed. When it is creating these ~3GB of indexes, it is using upwards of 120GB of RAM:

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      3041 couchbas 20 0 117g 116g 8524 S 217 92.5 26:42.75 indexer
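
      A simple sampling loop on the index node is enough to capture how the resident set grows over time (a sketch; the 30-second interval is arbitrary):

      # Log the indexer's PID, RSS, VSZ and CPU every 30 seconds.
      while true; do
          ps -o pid,rss,vsz,pcpu,comm -C indexer
          sleep 30
      done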

      Eventually the OOM killer kicks in and kills the indexer:

      Apr 17 17:37:24 (none) kernel: [ 3778.569957] Out of memory: Kill process 3041 (indexer) score 987 or sacrifice child
      Apr 17 17:37:24 (none) kernel: [ 3778.569984] Killed process 3041 (indexer) total-vm:135038264kB, anon-rss:130141676kB, file-rss:0kB
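
      The kill/restart loop can be confirmed by grepping the kernel log for repeated kills and noting that the indexer comes back with a new PID each time (a sketch, run on the index node):

      # Each OOM kill leaves a "Killed process ... (indexer)" line in the kernel log.
      dmesg | grep -i 'killed process.*indexer'
      # After an automatic restart the indexer shows up again under a new PID.
      pgrep -l indexer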

      Dataset:
      135 million flat JSON documents, about 350 bytes each.
      Approximately 50GB of data in total.

      Data is 100% in memory.

      Environment:
      3 Data Nodes, each with 128GB RAM and 20 cores (40 HT)
      1 Index Node, with 128GB RAM and 20 cores (40 HT)
      1 Query Node, with 128GB RAM and 20 cores (40 HT)

      Bucket Quota: 232GB Total (77GB per data node)
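
      The per-node quota can be double-checked against the cluster manager (a sketch; host and credentials are placeholders):

      # Returns the bucket definition, including its RAM quota, as JSON.
      curl -s -u Administrator:password http://data-node:8091/pools/default/buckets/noaa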

      Other Notes:

      After the indexer had failed a number of times, I tried increasing the indexer thread count in the settings to 20; it is still set at that level.
      Note: this problem existed before I changed the setting.
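
      For reference, the thread count was changed through the UI; the equivalent REST call should look roughly like the following. This is a sketch that assumes the /settings/indexes endpoint and indexerThreads parameter of the 4.0 admin REST API; host and credentials are placeholders.

      # Bump the GSI indexer thread count to 20 (assumed endpoint/parameter names).
      curl -u Administrator:password -X POST http://cluster-node:8091/settings/indexes -d indexerThreads=20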

      Collecting debug logs to upload now.

      Attachments


        Activity

          People

            Assignee: Deepkaran Salooja (deepkaran.salooja)
            Reporter: Tom Green (tom.green) (Inactive)
            Votes: 0
            Watchers: 10

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes
