Couchbase Server / MB-60736

Handle GetDcpSeqnos taking more time in the event of KV node failover

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • Morpheus
    • 7.1.3
    • secondary-index
    • None
    • Untriaged
    • 0
    • Unknown

    Description

      In a customer setup, the following issue has been observed:

      a. A KV node has failed over.

      b. This KV node hosted an ephemeral bucket. The failover has triggered a rollback on the indexer.

      c. Timekeeper::handleStats is trying to get seqnos from memcached, but the get seqnos call (GetDcpSeqnos) is timing out because of the KV node failover.

      d. During this 2-minute window, the indexer has changed the rollback timestamp, but the change is not propagated to the client because the timekeeper is stuck.

      All scans in this 2-minute window have failed with a rollback time mismatch error. Since the KV node failover leads to a rollback on all index replicas in this case, none of the replicas could serve the scans, leading to service unavailability.
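
      The sequence above comes down to a sequential handler that stays blocked on a slow remote call while newer state waits behind it. One possible way to bound that wait is sketched below in Python. This is not the indexer's code and not a fix proposed in this ticket: get_seqnos_bounded, slow_fetch, the deadline value and the fallback to the last known seqnos are all hypothetical, and whether stale seqnos are acceptable to the timekeeper is an assumption that would need to be checked.

```python
import concurrent.futures
import time


def get_seqnos_bounded(fetch, deadline_s, last_known, executor):
    """Call fetch() on a worker thread and wait at most deadline_s seconds.

    Returns (seqnos, fresh). If the deadline passes, the caller gets the
    last known seqnos back with fresh=False instead of staying blocked.
    """
    future = executor.submit(fetch)
    try:
        return future.result(timeout=deadline_s), True
    except concurrent.futures.TimeoutError:
        # The worker thread keeps running; its late result is simply ignored.
        return last_known, False


def slow_fetch():
    # Hypothetical stand-in for a seqno request to a failed-over KV node.
    time.sleep(0.3)
    return {0: 120}


if __name__ == "__main__":
    last_known = {0: 100}  # hypothetical vbucket -> seqno map
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        start = time.monotonic()
        seqnos, fresh = get_seqnos_bounded(slow_fetch, 0.05, last_known, pool)
        waited = time.monotonic() - start
    # The handler was held up for roughly the deadline, not the full fetch.
    print(seqnos, fresh, f"handler waited {waited:.2f}s")
```

      With a bound like this, the handler returns after about the deadline, so a later message (such as a changed rollback timestamp) would not have to wait for the full timeout of the slow call.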

          People

            Assignee: Unassigned
            Reporter: Varun Velamuri (varun.velamuri)