  1. Couchbase Server
  2. MB-56850

[BP 7.1.5] - XDCR - Prolonged TMPFAIL or ENOMEM could lead to memory bloat

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 7.1.5
    • Affects Version/s: 6.6.0, 6.6.1, 6.6.2, 6.6.3, 6.6.4, 6.6.5, 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.0.4, 7.1.0, 7.1.1
    • Component/s: gomemcached, XDCR
    • Security Level: Public
    • Untriaged
    • 1
    • No

    Description

      1. XDCR receives ENOMEM and prepares the packet for resend (http://src.couchbase.org/source/xref/6.6.2/goproj/src/github.com/couchbase/goxdcr/parts/xmem_nozzle.go#2041).
      2. This leads to a call to get the request bytes: http://src.couchbase.org/source/xref/6.6.2/goproj/src/github.com/couchbase/goxdcr/parts/xmem_nozzle.go#2426-2427
      3. At first glance, that call looks like it uses a slice reference, but digging into gomemcached…
      4. http://src.couchbase.org/source/xref/6.6.2/godeps/src/github.com/couchbase/gomemcached/mc_req.go#104-118 <- this call actually makes a copy of the data.

      So even if the stats do not move (as in the analysis I’ve shown), as long as the target keeps returning ENOMEM (and potentially TMPFAIL), XDCR will keep retrying the send, and each resend allocates new memory instead of reusing the existing buffer.

       

      Issue Resolution
      When target data nodes were undersized or consistently overwhelmed, XDCR memory usage could increase as it retried sends. Memory that has already been allocated is now reused on retry instead of generating garbage for the GC to clean up.

          For Gerrit Dashboard: MB-56850
            People

              ayush.nayyar Ayush Nayyar
              neil.huang Neil Huang

                Gerrit Reviews

                  There are no open Gerrit changes
