  1. Couchbase Server
  2. MB-42256

"seqno_acknowledge failed because this vbucket doesn't exist" disconnects replication

    Details

    • Triage:
      Untriaged
    • Story Points:
      1
    • Is this a Regression?:
      No

      Description

      If a "seqno-ack" from consumer to producer arrives at just the wrong moment, the producer can find that the active vbucket has already been deleted, and it responds to the consumer with "not-my-vbucket". The consumer receives the response packet and logs the following error:

      WARNING 121: Unsupported response packet received with opcode: 0x61 (DCP_SEQNO_ACKNOWLEDGED)
      

      The error happens because "mcbp" has no response handler configured for this opcode. This forces a disconnect of replication, which will fail a rebalance.

      The code shows that a seqno-ack can generate a response, so we should handle such responses gracefully to avoid unexpected disconnects and rebalance failures.
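
The failure mode described above can be illustrated with a small sketch. This is not the kv_engine code and not the content of the actual fix; it only models a response dispatcher that disconnects when an opcode has no handler, and shows how registering a handler for seqno-ack responses avoids that. The opcode value 0x61 comes from the log line above and 0x07 is the usual "not-my-vbucket" status code; the names `dispatch_response`, `handle_seqno_ack_response`, `Disconnect` and `HANDLERS_WITH_FIX` are hypothetical, and the choice of which statuses to tolerate is an assumption.

```python
from enum import IntEnum


class Opcode(IntEnum):
    DCP_SEQNO_ACKNOWLEDGED = 0x61  # value taken from the log line in the description


class Status(IntEnum):
    SUCCESS = 0x00
    NOT_MY_VBUCKET = 0x07  # "not-my-vbucket"


class Disconnect(Exception):
    """Hypothetical signal that the replication connection must be closed."""


def handle_seqno_ack_response(status):
    # Assumption: not-my-vbucket is benign here, because the vbucket was
    # deleted while the seqno-ack was in flight.
    if status in (Status.SUCCESS, Status.NOT_MY_VBUCKET):
        return "ok"
    raise Disconnect(f"seqno-ack failed with status {int(status):#06x}")


def dispatch_response(handlers, opcode, status):
    handler = handlers.get(opcode)
    if handler is None:
        # Behaviour reported in this issue: no handler means disconnect.
        raise Disconnect(
            f"Unsupported response packet received with opcode: {int(opcode):#04x}"
        )
    return handler(status)


# Hypothetical handler table with the seqno-ack response handler registered.
HANDLERS_WITH_FIX = {Opcode.DCP_SEQNO_ACKNOWLEDGED: handle_seqno_ack_response}
```

With an empty handler table, a "not-my-vbucket" response to a seqno-ack raises `Disconnect`, which mirrors the reported rebalance failure; with `HANDLERS_WITH_FIX` the same response is absorbed and the connection stays up.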

          Activity

          build-team Couchbase Build Team added a comment -

          Build couchbase-server-7.0.0-3717 contains kv_engine commit 30a1c9e with commit message:
          MB-42256: Handle seqno ack responses

          jwalker Jim Walker added a comment -

          This one has no reliable steps to validate. We never actually reproduced the issue; we fixed it from logs and code inspection.

          Reproducing it would require cycles of rebalance while sync-writes are being performed, in the hope that a vbucket is deleted just before a seqno-ack arrives.

          ritam.sharma Ritam Sharma added a comment -

          Jim Walker - Please help with steps to validate the issue.

          build-team Couchbase Build Team added a comment -

          Build couchbase-server-6.6.1-9159 contains kv_engine commit 30a1c9e with commit message:
          MB-42256: Handle seqno ack responses


            People

            Assignee:
            jwalker Jim Walker
            Reporter:
            jwalker Jim Walker
            Votes:
            1
            Watchers:
            12