  1. Couchbase Server
  2. MB-17241

Janitor activation of vbucket post rebalance failure results in unnoticed dead DCP stream

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 3.1.4
    • 3.0
    • ns_server
    • Security Level: Public
    • None
    • Untriaged
    • Unknown
    • ns_server: Dec 28 - Jan 22

    Description

      We saw this recently in production. Sequence of activities was:

      1. Rebalance failed while DCP takeover was in progress for some vBuckets. These vBuckets were in the dead state on the old master but not yet active on the new master.
      2. The janitor showed up at some point to clean up the situation, which among other things means changing the state of the dead vBuckets to active.
      3. Before the vBucket states were set, the replications were cleaned up. As part of this cleanup, new replications were established from the about-to-be-active-but-currently-dead vBucket to its replicas.
      4. Next, the janitor changed the state of the vBucket to active in the KV engine.
      5. Because of the vBucket state change, the KV engine "marked the stream as dead" (i.e. sent a DCP_STREAM_END). ns_server did not pick this up, so no new replication stream was set up.
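
      The ordering problem in steps 3 to 5 can be illustrated with a small toy model. This is only a sketch: it is not ns_server or KV engine code, and every name in it (ToyKVEngine, janitor_cleanup, handle_stream_end) is hypothetical. It assumes, as described above, that a vBucket state change ends all streams on that vBucket. Re-opening the stream when the stream-end notification arrives is shown as one possible reaction, not necessarily the fix that was applied for this issue.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKVEngine:
    """Toy stand-in for the KV engine (hypothetical, for illustration only)."""
    states: dict = field(default_factory=dict)     # vbucket id -> "dead" | "active"
    streams: dict = field(default_factory=dict)    # vbucket id -> set of replica names
    listeners: list = field(default_factory=list)  # callbacks invoked on stream end

    def open_stream(self, vb, replica):
        self.streams.setdefault(vb, set()).add(replica)

    def set_state(self, vb, new_state):
        old_state = self.states.get(vb)
        self.states[vb] = new_state
        if old_state != new_state:
            # Assumption from the report: a state change ends every
            # stream on that vBucket (the DCP_STREAM_END in step 5).
            ended = sorted(self.streams.pop(vb, set()))
            for replica in ended:
                for notify in self.listeners:
                    notify(vb, replica)


def janitor_cleanup(engine, vb, replicas, handle_stream_end):
    """Replay steps 3 and 4: set up replications, then activate the vBucket."""
    if handle_stream_end:
        # One possible reaction: re-open the stream when it is reported dead.
        engine.listeners.append(lambda v, r: engine.open_stream(v, r))
    for replica in replicas:
        engine.open_stream(vb, replica)   # step 3: streams from a still-dead vBucket
    engine.set_state(vb, "active")        # step 4: the state change ends those streams
    return engine.streams.get(vb, set())  # streams still alive after cleanup


# Stream end ignored: the vBucket ends up active with no replication streams.
ignored = janitor_cleanup(ToyKVEngine(states={7: "dead"}), 7, ["r1", "r2"], False)

# Stream end handled: both streams are re-established after activation.
handled = janitor_cleanup(ToyKVEngine(states={7: "dead"}), 7, ["r1", "r2"], True)
```

      In the first call the vBucket is active but has no live replication streams, which matches the unnoticed dead stream in step 5. In the second call the stream-end notification is observed and the streams are opened again after the state change.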

            People

              ericcooper Eric Cooper (Inactive)
              dfinlay Dave Finlay