Couchbase Server / MB-47773

XDCR - backfill_request_handler could hang forever

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.1.0
    • 7.0.0, 7.1.0
    • XDCR
    • Untriaged
    • 1
    • No

    Description

      The backfill request handler's cooldown mechanism is flawed: it can cause the run() routine to enter the persist case when there are no operations to persist.

      This can leave the handler stuck forever, waiting for an operation that will never come. While it is stuck, all backfill operations become unresponsive, such as handling VB done events or raising future backfills.

      A typical symptom is a backfill pipeline that hangs and doesn't go away (potentially with changes_left staying at 0).

      The stack trace would show a number of goroutines in HandleVBTaskDone() (one per VB), and one goroutine stuck at this location:

      https://github.com/couchbase/goxdcr/blob/26d8add3a1c760f1c0c99569a4582e7b7c09c689/backfill_manager/backfill_request_handler.go#L296

      			// No more incoming requests - done bursting handling, do a single metakv operation
      			select {
      			case persistType := <-b.persistenceNeededCh: 
      				err := b.metaKvOp(persistType)
      

            People

              neil.huang Neil Huang
              neil.huang Neil Huang