Couchbase Server / MB-22063

CLONE - couchdb oom without node recovering

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 4.6.0
    • 4.6.0
    • view-engine

    Description

      About 1.5 days into testing, a node went down and has not come back online. This looks like the same behavior seen in 3.1.3, but my understanding is that MB-14068 fixed it in 4.0+. Confirmation is needed on whether this is indeed a duplicate issue.

      Around the time of the OOM, the duplicate partition error is logged on 172.23.105.62:

      [couchdb:error,2016-11-04T04:48:28.957-07:00,couchdb_ns_1@127.0.0.1:<0.1258.0>:couch_log:error:44]set view `default`, mapreduce_view main (prod) group `_design/scale` have the duplicate partition versions [{35,
      

      About 3 minutes later the kernel OOM killer is invoked:

      Nov  4 04:51:47 kvm-s63704 kernel: [19417698.447312] Out of memory: Kill process 9098 (beam.smp) score 674 or sacrifice child
      Nov  4 04:51:47 kvm-s63704 kernel: [19417698.453765] Killed process 9175 (godu) total-vm:11632kB, anon-rss:4588kB, file-rss:0kB
      Nov  4 04:51:47 kvm-s63704 kernel: [19417698.463027] beam.smp invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
      Nov  4 04:51:47 kvm-s63704 kernel: [19417698.463032] beam.smp cpuset=/ mems_allowed=0
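The kernel lines above distinguish the process the OOM killer selected from the process it actually killed. A minimal sketch of pulling those apart follows; the regular expressions are assumptions based only on the three message shapes quoted here, and the wording of these kernel messages varies between kernel versions.

```python
import re

# Lines copied from the kernel log quoted above.
KERNEL_LOG = """\
Nov  4 04:51:47 kvm-s63704 kernel: [19417698.447312] Out of memory: Kill process 9098 (beam.smp) score 674 or sacrifice child
Nov  4 04:51:47 kvm-s63704 kernel: [19417698.453765] Killed process 9175 (godu) total-vm:11632kB, anon-rss:4588kB, file-rss:0kB
Nov  4 04:51:47 kvm-s63704 kernel: [19417698.463027] beam.smp invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
"""

# Assumed patterns for this kernel's message format (not a general parser).
SELECTED = re.compile(r"Out of memory: Kill process (\d+) \(([^)]+)\) score (\d+)")
KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")
INVOKED = re.compile(r"\] (\S+) invoked oom-killer")


def summarize_oom(log_text):
    """Group OOM-related kernel lines into selected / killed / invoked_by."""
    events = {"selected": [], "killed": [], "invoked_by": []}
    for line in log_text.splitlines():
        if m := SELECTED.search(line):
            # (pid, process name, badness score)
            events["selected"].append((int(m.group(1)), m.group(2), int(m.group(3))))
        elif m := KILLED.search(line):
            # (pid, process name)
            events["killed"].append((int(m.group(1)), m.group(2)))
        elif m := INVOKED.search(line):
            events["invoked_by"].append(m.group(1))
    return events


summary = summarize_oom(KERNEL_LOG)
print(summary)
```

On the quoted lines, the selected process is beam.smp (pid 9098) while the process reported as killed is godu (pid 9175), which matches the "or sacrifice child" wording in the first message.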
      

      Also, the memcached process is no longer running on this node:

      root@kvm-s63704:/opt/couchbase/var/lib/couchbase/logs# ps aux | grep [m]emcached
      root@kvm-s63704:/opt/couchbase/var/lib/couchbase/logs# 
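In the command above, the `[m]emcached` pattern matches the string `memcached` but not grep's own command line (which contains the literal text `[m]emcached`), so empty output means no memcached process was found. The same check can be sketched over captured `ps aux` text; the function name and the sample output below are hypothetical and only illustrate the idea.

```python
def matching_processes(ps_output, name):
    """Return `ps aux` lines whose COMMAND column mentions `name`, ignoring grep itself."""
    hits = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # `ps aux` prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        if name in command and not command.startswith("grep"):
            hits.append(line)
    return hits


# Made-up sample output, not taken from the affected node.
SAMPLE_PS = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  33600  2900 ?        Ss   Nov01   0:02 /sbin/init
root      4242  0.0  0.0  11744   932 pts/0    S+   11:50   0:00 grep memcached
"""

found = matching_processes(SAMPLE_PS, "memcached")
print(found)
```

An empty list here corresponds to the empty output of the `ps aux | grep [m]emcached` command shown above.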
      

      The last log entries from memcached are from the bucket shutdown 2 hours ago:

      2016-11-04T09:44:47.291209-07:00 NOTICE Shutting down OpenSSL
      2016-11-04T09:44:47.797552-07:00 NOTICE Shutting down libevent
      2016-11-04T09:44:47.797644-07:00 NOTICE Shutdown complete.
      

      So it doesn't look like memcached (mcd) is being targeted by the OOM killer, just couchdb.

      Subsequent attempts to restart couchdb also result in OOM.

            People

              harsha Harsha Havanur
              arunkumar Arunkumar Senthilnathan (Inactive)