  1. Couchbase Server
  2. MB-51064

Detect and alert users on non-homogeneous disk performance across nodes in the cluster

Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • feature-backlog
    • 7.0.3, 7.1.0
    • ns_server, UI
    • None
    • 1

    Description

      As seen in some CBSEs, when nodes in a cluster have different disk performance, users can experience unexpected behaviour from Couchbase services that is hard to troubleshoot and explain. In one case, it caused periodic spikes in num_docs_pending (documents waiting to be indexed), which can in turn cause stale=false scans to time out. It often takes a long time to complete an RCA and attribute the reported symptoms to underlying disk issues, and multiple teams need to get involved.

      It would be good if the server could detect this condition automatically, raise an alert in the server UI (and perhaps a system event), collect the relevant data in cbcollect, and have mortimer highlight the issue when a supportal snapshot is uploaded. This would help multiple stakeholders (customers, support, engineering, etc.).
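
      To make the detection idea concrete, here is a minimal sketch of one possible check, not a proposed ns_server implementation. It assumes each node reports a list of disk latency samples (for example, write or fsync latency in milliseconds) and flags nodes whose median latency is well above the cluster-wide median. The function name, the input shape and the 2.0 ratio threshold are all hypothetical and chosen only for illustration.

```python
from statistics import median


def find_slow_disk_nodes(samples_by_node, ratio_threshold=2.0):
    """Flag nodes whose median disk latency is far above the cluster median.

    samples_by_node: dict mapping a node name to a list of latency samples
        (hypothetical input; the metric and its unit are assumptions).
    ratio_threshold: how many times slower than the cluster median a node
        must be before it is flagged (hypothetical default).

    Returns a dict of {node_name: ratio_to_cluster_median} for flagged nodes.
    """
    # Per-node median is less sensitive to short bursts than the mean.
    node_medians = {
        node: median(samples)
        for node, samples in samples_by_node.items()
        if samples  # skip nodes that reported no samples
    }

    # With fewer than 3 nodes the cluster median is dominated by the
    # nodes being compared, so the comparison is not meaningful.
    if len(node_medians) < 3:
        return {}

    cluster_median = median(node_medians.values())
    if cluster_median <= 0:
        return {}

    flagged = {}
    for node, node_median in node_medians.items():
        ratio = node_median / cluster_median
        if ratio > ratio_threshold:
            flagged[node] = ratio
    return flagged
```

      A real check would also need to decide which disk metric to sample, over what time window, and how to avoid alerting on short-lived differences (for example, during compaction or rebalance).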

      This applies to all Couchbase services that store data. I added ns_server and UI as the initial components because that is probably where we should start. We can add or change components as necessary after we spend some time figuring out how to approach this platform-wide.

            People

              Abhijeeth.Nuthan Abhijeeth Nuthan
              jeelan.poola Jeelan Poola
              Votes: 0
              Watchers: 16
