  1. Couchbase Server
  2. MB-5428

Couchbase Server failed to evict items when mem_used surpassed the high water mark because of a rare operational deadlock in ep-engine between tap consumers and producers

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 2.0-beta
    • 1.8.1
    • couchbase-bucket
    • Security Level: Public
    • 2 nodes with 15G physical memory. Mem quota = 12G. 3 clients were trying to load 7M items in total.

    Description

      From Mike:

      This issue is an operational deadlock that we are already aware of and have fixed in 2.0. It is not a regression from 1.8. It is triggered when items are loaded into Couchbase at a very fast rate. On a two-node cluster, each node surpasses 90% memory usage. This causes the tap consumers to tell the producers to back off, since the consumers will not accept any more data. At the same time the item pager is running and trying to evict items, but it cannot, because all of the values in memory are waiting in the checkpoint queues to be replicated.
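
      The cycle described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not ep-engine code: the Node class, its fields, and the replicate/step helpers are hypothetical names invented for illustration, and memory is counted in abstract units. It shows that once both nodes sit above the 90% backoff threshold with all of their items still pending replication, neither replication nor eviction can make progress.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy model of one node. All names are hypothetical, not ep-engine APIs."""
    quota: int                  # memory quota, in abstract units
    backoff_ratio: float = 0.9  # consumer refuses replica data at or above this fraction
    resident: int = 0           # items already replicated, so evictable
    pending: int = 0            # items still in the checkpoint queue, so not evictable

    @property
    def mem_used(self) -> int:
        return self.resident + self.pending

    def accepts_replica_data(self) -> bool:
        # The tap consumer tells the producer to back off when memory is too high.
        return self.mem_used < self.backoff_ratio * self.quota

    def evict(self) -> int:
        # The item pager can only drop items that are no longer awaiting replication.
        freed = self.resident
        self.resident = 0
        return freed


def replicate(src: Node, dst: Node) -> int:
    """Drain src's checkpoint queue into dst unless dst signals backoff."""
    if not dst.accepts_replica_data():
        return 0
    moved = src.pending
    src.pending = 0
    src.resident += moved  # now replicated, so evictable on src
    dst.resident += moved  # replica copy held on dst
    return moved


def step(a: Node, b: Node) -> int:
    """One round of replication in both directions plus a pager pass on each node.

    Returns the total amount of progress made; 0 means the pair is stuck.
    """
    return replicate(a, b) + replicate(b, a) + a.evict() + b.evict()
```

      With both nodes at 95 of 100 units and everything pending, step() returns 0 on every call: each consumer refuses data, so no checkpoint queue drains, so the pager has nothing it is allowed to evict. Starting from 50 of 100 units instead, the same round drains both queues and eviction succeeds.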

      This fix can potentially be back-ported, but Chiyoung feels like this may be too risky for 1.8.1.

      From Ronnie:

      The EC2 cluster had two nodes with a 12G mem quota each. When memory usage reached 90%, clients received backoff signals from the server (errorno: 134). The cluster did not recover from this point, even after running for more than 12 hours. The highest cluster-wide ops was around 6k.

          People

            mikew Mike Wiederhold [X] (Inactive)
            ronnie Ronnie Sun (Inactive)