  Couchbase Server / MB-14562

goxdcr uses more than 2x the memory of memcached, invoking the Linux OOM killer

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • 4.0.0
    • 4.0.0
    • XDCR
    • Security Level: Public
    • CentOS 6.x, 15 GB RAM, 4 cores

    Description

      Build
      ------
      4.0.0-1867

      Following up on MB-14457 (observed on 172.23.105.58 during system test)

      goxdcr uses more than 2x the memory (4GB) of memcached (2GB). With no swap space left, the Linux OOM killer first kills goxdcr and then goes after memcached. Since this causes memcached to be killed, I'm marking this as a blocker.

      The .58 node is a VM with 15 GB of RAM and 4 CPU cores.

      Attaching logs from .58.

      In /var/log/messages:

      Apr 16 01:33:44 guinep-s10505 kernel: Free swap = 0kB
      Apr 16 01:33:44 guinep-s10505 kernel: Total swap = 10727416kB
      Apr 16 01:33:44 guinep-s10505 kernel: 4019199 pages RAM
      Apr 16 01:33:44 guinep-s10505 kernel: 123935 pages reserved
      Apr 16 01:33:44 guinep-s10505 kernel: 14733 pages shared
      Apr 16 01:33:44 guinep-s10505 kernel: 3867796 pages non-shared
      Apr 16 01:33:44 guinep-s10505 kernel: [ 2899] 0 2899 4033 963 0 0 0 atop
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3760] 498 3760 314477 2356 3 0 0 beam.smp
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3799] 498 3799 506818 119852 3 0 0 beam.smp
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3839] 498 3839 26517 200 0 0 0 sh
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3840] 498 3840 1014 88 3 0 0 cpu_sup
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3841] 498 3841 1014 138 3 0 0 memsup
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3854] 498 3854 2698 104 2 0 0 inet_gethost
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3855] 498 3855 2698 82 0 0 0 inet_gethost
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3856] 498 3856 251734 6421 1 0 0 beam.smp
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3905] 498 3905 26518 200 3 0 0 sh
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3906] 498 3906 1015 135 0 0 0 memsup
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3907] 498 3907 1015 88 0 0 0 cpu_sup
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3910] 498 3910 1492 194 0 0 0 godu
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3911] 498 3911 26517 200 3 0 0 sh
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3912] 498 3912 1491 95 0 0 0 godu
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3920] 498 3920 50176 225 1 0 0 moxi
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3925] 498 3925 2098 166 0 0 0 sigar_port
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3929] 498 3929 1620 123 0 0 0 goport
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3930] 498 3930 46938 237 1 0 0 saslauthd-port
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3932] 498 3932 1620 124 1 0 0 goport
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3933] 498 3933 115706 577 2 0 0 moxi
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3934] 498 3934 115739 601 1 0 0 moxi
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3935] 498 3935 232491 568 3 0 0 beam.smp
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3940] 498 3940 4397871 2749351 3 0 0 goxdcr
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3942] 498 3942 109292 517 0 0 0 projector
      Apr 16 01:33:44 guinep-s10505 kernel: [ 3993] 498 3993 2071616 836127 0 0 0 memcached
      Apr 16 01:33:44 guinep-s10505 kernel: [ 4011] 498 4011 2699 104 3 0 0 inet_gethost
      Apr 16 01:33:44 guinep-s10505 kernel: [ 4012] 498 4012 2699 82 3 0 0 inet_gethost
      Apr 16 01:33:44 guinep-s10505 kernel: [16341] 498 16341 2698 78 2 0 0 inet_gethost
      Apr 16 01:33:44 guinep-s10505 kernel: [20783] 0 20783 4034 964 0 0 0 atop
      Apr 16 01:33:44 guinep-s10505 kernel: [24016] 89 24016 19688 401 0 0 0 pickup
      Apr 16 01:33:44 guinep-s10505 kernel: [25028] 0 25028 1017 128 0 0 0 sleep
      Apr 16 01:33:44 guinep-s10505 kernel: Out of memory: Kill process 3940 (goxdcr) score 646 or sacrifice child
      Apr 16 01:33:44 guinep-s10505 kernel: Killed process 3940, UID 498, (goxdcr) total-vm:17591484kB, anon-rss:10996992kB, file-rss:412kB
      :
      :
      Apr 16 01:33:44 guinep-s10505 kernel: Out of memory: Kill process 3993 (memcached) score 302 or sacrifice child
      Apr 16 01:33:44 guinep-s10505 kernel: Killed process 3993, UID 498, (memcached) total-vm:8286464kB, anon-rss:3343632kB, file-rss:860kB
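
The process table in the excerpt above reports sizes in pages rather than kilobytes, which makes it easy to misread. The following is a minimal sketch for converting those rows, assuming 4 KiB pages and the column order `[pid] uid tgid total_vm rss cpu oom_adj oom_score_adj name` (the header line is not part of the excerpt, so the column order is an assumption). The names `PAGE_KB`, `ROW`, and `parse_oom_rows` are hypothetical helpers, not part of any Couchbase tool.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages (the x86_64 default)

# Assumed column order (the header line is not present in the excerpt):
# [pid] uid tgid total_vm rss cpu oom_adj oom_score_adj name
ROW = re.compile(
    r"kernel: \[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+\d+\s+-?\d+\s+-?\d+\s+(?P<name>\S+)"
)

def parse_oom_rows(lines):
    """Yield (name, pid, total_vm_kb, rss_kb) for each process-table row."""
    for line in lines:
        m = ROW.search(line)
        if m:
            yield (m["name"], int(m["pid"]),
                   int(m["total_vm"]) * PAGE_KB, int(m["rss"]) * PAGE_KB)

# Two rows copied from the /var/log/messages excerpt above.
sample = [
    "Apr 16 01:33:44 guinep-s10505 kernel: [ 3940] 498 3940 4397871 2749351 3 0 0 goxdcr",
    "Apr 16 01:33:44 guinep-s10505 kernel: [ 3993] 498 3993 2071616 836127 0 0 0 memcached",
]

rows = list(parse_oom_rows(sample))
for name, pid, vm_kb, rss_kb in rows:
    print(f"{name:10s} pid={pid} total_vm={vm_kb} kB rss={rss_kb} kB")
```

Under these assumptions the goxdcr row converts to a total_vm of 17591484 kB, which agrees with the `total-vm:17591484kB` figure in the kill message, and to about 10.5 GiB resident against about 3.2 GiB for memcached.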

      [user:info,2015-04-16T1:33:47.426,ns_1@172.23.105.58:<0.211.0>:ns_log:crash_consumption_loop:70]Port server goxdcr on node 'babysitter_of_ns_1@127.0.0.1' exited with status 1. Restarting. Messages: PipelineManager 2015-04-16T01:33:42.973-07:00 [INFO] Replication Status = map[3754c145e93cbcddccbda66d99e2163a/standardbucket/standardbucket:name={3754c145e93cbcddccbda66d99e2163a/standardbucket/standardbucket}, status={Pending}, errors={[{"time":"2015-04-16T01:33:30.112118921-07:00","errMsg":"map[xmem_3754c145e93cbcddccbda66d99e2163a/standardbucket/standardbucket_172.23.105.48:11210_1:Xmem is stuck]"},

      debug.log:[user:info,2015-04-16T1:33:45.863,ns_1@172.23.105.58:<0.211.0>:ns_log:crash_consumption_loop:70]Port server memcached on node 'babysitter_of_ns_1@127.0.0.1' exited with status 137. Restarting. Messages: 2015-04-16T01:32:12.785397-07:00 WARNING (standardbucket) DCP (Producer) eq_dcpq:xdcr:dcp_3754c145e93cbcddccbda66d99e2163a/standardbucket/standardbucket_172.23.105.58:11210_1 - (vb 511) Sending disk snapshot with start seqno 0 and end seqno 32519
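
The memcached exit status of 137 in the line above is consistent with a SIGKILL, which is what the OOM killer delivers. A minimal sketch of that decoding, assuming the status follows the common shell convention of 128 plus the signal number (whether ns_server reports statuses this way is an assumption); `describe_exit_status` is a hypothetical helper.

```python
import signal

def describe_exit_status(status):
    """Interpret an exit status, assuming 128 + N means 'killed by signal N'."""
    if status > 128:
        try:
            sig = signal.Signals(status - 128)
        except ValueError:
            # Signal number not known on this platform.
            return f"killed by signal {status - 128}"
        return f"killed by {sig.name}"
    return f"exited with code {status}"

print(describe_exit_status(137))  # memcached, from the log line above
print(describe_exit_status(1))    # goxdcr, from the earlier log line
```

On Linux, 137 - 128 = 9 maps to SIGKILL, while the goxdcr status of 1 reads as an ordinary non-zero exit under the same convention.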

          People

            Assignee: Aruna Piravi (apiravi, Inactive)
            Reporter: Aruna Piravi (apiravi, Inactive)