  1. Couchbase Server
  2. MB-41108

Handle increase in RSS after rollback to snapshot in MOI


Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.0.0
    • 6.0.2
    • storage-engine
    • Untriaged
    • 1
    • Unknown
    • Plasma-Sprint-May-30-2021

    Description

      When indexes roll back to a disk snapshot, there appears to be a sudden increase in RSS. The issue can be reproduced locally with the following steps:

      a. Set up a 4-node cluster: 3 KV nodes and 1 index+n1ql node

      b. Populate the bucket with 30M docs. A large number of documents is required to see a visible change in RSS after rollback

      c. Create and build some indexes

      d. Block replication between KV node1 and KV node2

      e. Perform some mutations (~1000)

      f. Fail over KV node2

      The failover of KV node2 causes a rollback to the disk snapshot. Observe the increase in RSS after the rollback completes.
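
      To compare RSS before and after step (f), the value can be sampled from procfs on the index node. The following is a minimal sketch, assuming a Linux host; the function names are illustrative and not part of Couchbase Server, and the indexer PID has to be looked up separately.

```python
import re


def parse_vm_rss_kb(status_text: str) -> int:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s*kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found")
    return int(match.group(1))


def read_rss_kb(pid: int) -> int:
    """Read the current RSS of a process from procfs (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_rss_kb(f.read())


def rss_growth_pct(before_kb: int, after_kb: int) -> float:
    """Relative RSS change between two samples, as a percentage."""
    return 100.0 * (after_kb - before_kb) / before_kb
```

      Taking one sample before the failover and one after the rollback completes, then passing both to rss_growth_pct, gives a single number to compare across runs.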

      The increase in RSS appears to be caused by jemalloc fragmentation. After rollback, bin utilisation appears to drop, and the same amount of live data is then spread over more resident memory, which shows up as higher RSS.
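
      The link between bin utilisation and RSS can be illustrated with a simplified model: a bin holds fixed-size regions grouped into slabs, and utilisation is the fraction of regions in the allocated slabs that are in use. The sketch below is an assumption-laden illustration, not jemalloc's implementation, and the numbers are invented for the example.

```python
def bin_utilisation(curregs: int, curslabs: int, regs_per_slab: int) -> float:
    """Fraction of regions in use across all slabs currently held by a bin."""
    if curslabs == 0:
        return 1.0  # nothing allocated, so nothing is wasted
    return curregs / (curslabs * regs_per_slab)


def slab_bytes(curslabs: int, regs_per_slab: int, region_size: int) -> int:
    """Memory backing the bin's slabs, whether or not every region is in use."""
    return curslabs * regs_per_slab * region_size


# Hypothetical bin: 64-byte regions, 64 regions per slab.
REGION_SIZE = 64
REGS_PER_SLAB = 64
LIVE_REGIONS = 64_000

# Before rollback: live regions packed densely into 1000 slabs.
util_before = bin_utilisation(LIVE_REGIONS, 1000, REGS_PER_SLAB)
bytes_before = slab_bytes(1000, REGS_PER_SLAB, REGION_SIZE)

# After rollback: the same number of live regions spread over 1600 slabs.
util_after = bin_utilisation(LIVE_REGIONS, 1600, REGS_PER_SLAB)
bytes_after = slab_bytes(1600, REGS_PER_SLAB, REGION_SIZE)
```

      In this model the live data is unchanged, but utilisation falls from 1.0 to 0.625 and the memory backing the bin grows from 4,096,000 to 6,553,600 bytes.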

      In MOI, a rollback to a disk snapshot clears the existing indexed data and rebuilds the entire index from the disk snapshot and DCP. The main store is closed while the snapshot is being loaded, i.e. the two operations run concurrently. This could be the cause of the high fragmentation.

      The following things can be investigated:

      a. The performance penalty of closing the main store synchronously, i.e. closing the main store first and then loading the snapshot

      b. Other possible ways to minimise jemalloc fragmentation
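
      For item (a), the penalty could be estimated by timing the two orderings side by side. The harness below is only a sketch: close_main_store and load_snapshot are hypothetical stand-ins that sleep for a fixed time, and would need to be replaced by the real operations to produce meaningful numbers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def close_main_store(seconds: float = 0.1) -> None:
    """Hypothetical stand-in for freeing the existing main store."""
    time.sleep(seconds)


def load_snapshot(seconds: float = 0.1) -> None:
    """Hypothetical stand-in for loading the disk snapshot."""
    time.sleep(seconds)


def run_sequential() -> None:
    """Synchronous variant: close the main store, then load the snapshot."""
    close_main_store()
    load_snapshot()


def run_concurrent() -> None:
    """Concurrent variant: close and load overlap in time."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(close_main_store), pool.submit(load_snapshot)]
        for future in futures:
            future.result()  # re-raise any exception from the worker


def timed(fn) -> float:
    """Wall-clock duration of a single call to fn, in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {timed(run_sequential):.3f}s")
    print(f"concurrent: {timed(run_concurrent):.3f}s")
```

      The difference between the two timings is the cost of the synchronous close. It would need to be weighed against the RSS measured after each variant.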

       


          People

            girish.benakappa Girish Benakappa
            varun.velamuri Varun Velamuri