  1. Couchbase Server
  2. MB-4461

Replication cursor stuck from slave 1 to slave 2, hence a high number of checkpoint items in slave 1


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: 2.0-beta
    • Affects Version: 1.7.2
    • Component: couchbase-bucket
    • Security Level: Public
    • Environment: Ubuntu 10.04.3 LTS x86_64, Membase 1.7.2, Amazon m1.large instances, 16-node cluster.

    Description

      Two out of sixteen nodes are ejecting active items because their mem_used is above the high water mark. The other nodes are well below. Customer says that keys are of various sizes, but the larger ones should be spread out randomly across the different nodes. Number of keys on all nodes is roughly equal.

      The two problem nodes show ep_value_size much larger than a healthy node. However, looking at the sqlite data files, there's no significant difference in size of the files on disk (as seen, for example, in */membase.log).
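The comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming each node's engine stats have been saved as plain "name: value" text and that the high water mark is exposed as ep_mem_high_wat. The helper names, the 2x outlier ratio, and the sample numbers are hypothetical and are not taken from the customer's cluster.

```python
from statistics import median


def parse_stats(text):
    """Parse 'name: value' lines into a dict, keeping only integer values."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon, since some stat names contain colons.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric stats
    return stats


def flag_bloated_nodes(per_node, ratio=2.0):
    """Return nodes above the high water mark whose ep_value_size
    exceeds `ratio` times the cluster median (hypothetical threshold)."""
    baseline = median(s["ep_value_size"] for s in per_node.values())
    flagged = []
    for node, s in sorted(per_node.items()):
        over_high_wat = s["mem_used"] > s["ep_mem_high_wat"]
        value_outlier = s["ep_value_size"] > ratio * baseline
        if over_high_wat and value_outlier:
            flagged.append(node)
    return flagged


# Made-up numbers for illustration only.
sample = {
    "node-a": parse_stats("mem_used: 400\nep_mem_high_wat: 750\nep_value_size: 300"),
    "node-b": parse_stats("mem_used: 800\nep_mem_high_wat: 750\nep_value_size: 700"),
    "node-c": parse_stats("mem_used: 420\nep_mem_high_wat: 750\nep_value_size: 310"),
}
print(flag_bloated_nodes(sample))
```

Comparing against the median rather than the mean keeps one or two bloated nodes from dragging the baseline up, which matters when only two of sixteen nodes are affected.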

      FYI, the rise in data size seems to have started on these two nodes after a different node, 10.254.7.150, stopped responding to REST and membase was restarted (with 'service membase-server restart').

      The mbcollect_info data for these servers are in S3. The logs are named:

      membase 16: a good node, for comparison
      membase 07 and membase 14: the trouble nodes that are ejecting items due to large memory usage
      membase 11: the node that was restarted on Saturday

      Can someone please take a look at this, and help me understand why the ep_value_size might be bloating up for these two nodes?

      Thanks,

      Tim


          People

            Assignee: mikew Mike Wiederhold [X] (Inactive)
            Reporter: TimSmith Tim Smith (Inactive)