  1. Couchbase Server
  2. MB-8211

Making lots of requests in general to config/views/XDCR causes unexpected resource usage and/or crashing owing to OOM

Details

    • Bug
    • Resolution: Duplicate
    • Critical
    • 7.0.0
    • 2.0
    • ns_server
    • Security Level: Public
    • Note this is not centos-64 bit, but we now require OS as a field even if it's OS independent.
    • Untriaged
    • Centos 64-bit
    • Yes
    • 02/Sep/2013 - 20/Sep/2013

    Description

      Going into version 2.0, we integrated changes that make handling many streaming connections much less expensive in terms of memory. However, there is some evidence that creating many of them at the same time, or simply holding a large number of streaming connections open, could cause instability and crashing in ns_server.

      Note that in PHP deployments, we've found the number of worker processes per server to be 256, 512, or 5000 in some cases.

      Deployments, in turn, can be 50 or more systems.

      Thus, we could have 12,000+, 25,000+, or 250,000 processes, all of which may try to connect to a cluster.
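
The arithmetic behind those totals can be checked with a short sketch. It assumes exactly 50 servers (the text says "50 or more", so these are lower bounds), and the function name is only illustrative:

```python
# Worker counts per server quoted in the description.
WORKERS_PER_SERVER = [256, 512, 5000]
# Assumed deployment size; the description says "50 or more systems".
SERVERS = 50


def total_client_processes(workers_per_server: int, servers: int = SERVERS) -> int:
    """Number of processes that may each try to connect to the cluster."""
    return workers_per_server * servers


totals = [total_client_processes(w) for w in WORKERS_PER_SERVER]
print(totals)  # [12800, 25600, 250000]
```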

      This issue tracks possible solutions to this kind of problem.

      Possible solutions could be to drop connections, or to allow only as many connections as we have tested to work in our minimum system configuration.
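
As a rough illustration of the second option, the sketch below caps the number of concurrent connections and refuses anything beyond the cap. It is not ns_server code (ns_server is written in Erlang); the class name, its methods, and the limit value are all hypothetical:

```python
import threading


class ConnectionLimiter:
    """Hypothetical admission control: admit at most max_connections at once."""

    def __init__(self, max_connections: int) -> None:
        if max_connections < 1:
            raise ValueError("max_connections must be at least 1")
        self._max = max_connections
        self._active = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Return True if the connection is admitted, False if it should be dropped."""
        with self._lock:
            if self._active >= self._max:
                return False
            self._active += 1
            return True

    def release(self) -> None:
        """Call when an admitted connection closes."""
        with self._lock:
            if self._active > 0:
                self._active -= 1


# The limit of 2 is purely illustrative; a real limit would come from testing
# against the minimum system configuration.
limiter = ConnectionLimiter(max_connections=2)
admitted = [limiter.try_acquire() for _ in range(3)]
limiter.release()
admitted_after_release = limiter.try_acquire()
```

Rejecting at admission time keeps the memory cost bounded by the tested limit, at the price of pushing retry behaviour onto clients.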

      Note that on the client library side we have a method of sharing the configuration between multiple processes. However, it is not currently a complete solution and may never be because:

      • It is not on by default as it requires the configuration of a file path.
      • In some cases, during a rebalance, we can still have a rush of processes trying to get a configuration.
      • We do not currently have a good solution for updating the configuration quickly if it's a memcached type bucket, as there is no NOT_MY_VBUCKET response.
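
To make the idea of sharing a configuration through a file path concrete, here is a minimal sketch. It is not the client library's actual mechanism; the function name, the JSON file format, and the `max_age_s` freshness window are assumptions made for illustration:

```python
import json
import os
import tempfile
import time
from typing import Callable


def load_shared_config(path: str, fetch: Callable[[], dict], max_age_s: float = 5.0) -> dict:
    """Return the cached configuration at `path` if it is fresh, else fetch and cache it."""
    try:
        if time.time() - os.path.getmtime(path) <= max_age_s:
            with open(path, "r", encoding="utf-8") as f:
                return json.load(f)
    except (OSError, ValueError):
        pass  # missing or unreadable cache file: fall through and fetch

    config = fetch()  # hypothetical callable that asks the cluster for its config
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(config, f)
    os.replace(tmp_path, path)  # atomic rename so readers never see a partial file
    return config
```

Even in this sketch, every process whose cached copy has expired will call `fetch` at about the same moment, which is the rush described in the second bullet.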

      Even if we can solve all of those issues on the client library side, a bug or a very large number of client systems can still cause problems, so we will need ns_server to be reliable even in the face of lots of requests.

          People

            Aliaksey Artamonau (Inactive)
            ingenthr Matt Ingenthron
            Votes: 0
            Watchers: 10
