  1. Couchbase Server
  2. MB-47777

[BP 7.0.2] - XDCR - backfill_request_handler could hang forever

Details

    • Untriaged
    • 1
    • No

    Description

      The backfill request handler's cooldown mechanism is flawed: it can cause the run() routine to enter the persist case when there are no operations to persist.

      This can leave the handler stuck forever, waiting for an operation that will never come. While it is stuck, all backfill operations become unresponsive, such as handling VB done events or raising future backfills.

       

      A typical symptom is a backfill pipeline that hangs and doesn't go away (potentially with changes_left staying at 0).

      The stack trace would show a number of goroutines in HandleVBTaskDone() (one per VB), and one goroutine stuck at this location:

      https://github.com/couchbase/goxdcr/blob/26d8add3a1c760f1c0c99569a4582e7b7c09c689/backfill_manager/backfill_request_handler.go#L296

      			// No more incoming requests - done bursting handling, do a single metakv operation
      			select {
      			case persistType := <-b.persistenceNeededCh: 
      				err := b.metaKvOp(persistType)
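
      To illustrate how a goroutine parked on such a receive shows up in a stack trace, here is a small self-contained sketch using the standard runtime.Stack call. It is an illustration only: parkedHandler, goroutineDump and stuckVisible are hypothetical names, and how a dump is actually collected from a running goxdcr process is not covered here.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"time"
)

// parkedHandler is a hypothetical stand-in for a handler goroutine waiting
// on a channel that nobody writes to. The noinline directive keeps its
// frame visible under its own name in the dump.
//
//go:noinline
func parkedHandler(ch <-chan struct{}) {
	<-ch
}

// goroutineDump returns the stack traces of all goroutines in the process.
func goroutineDump() string {
	buf := make([]byte, 1<<20)
	n := runtime.Stack(buf, true) // true = include every goroutine
	return string(buf[:n])
}

// stuckVisible starts a parked goroutine, gives it time to block, and
// reports whether the dump contains a frame for parkedHandler.
func stuckVisible() bool {
	ch := make(chan struct{})
	go parkedHandler(ch)
	time.Sleep(50 * time.Millisecond)
	found := strings.Contains(goroutineDump(), "parkedHandler")
	close(ch) // release the goroutine so the demo does not leak it
	return found
}

func main() {
	fmt.Println("parked handler visible in dump:", stuckVisible())
}
```

      In a real dump, the equivalent frame would point at the source line of the receive, which is how the location linked above can be matched against a hung process.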
      


          Activity

            neil.huang Neil Huang created issue -
            neil.huang Neil Huang made changes -
            Field Original Value New Value
            Link This issue Clones MB-47773 [ MB-47773 ]
            neil.huang Neil Huang made changes -
            Link This issue is a backport of MB-47773 [ MB-47773 ]
            neil.huang Neil Huang made changes -
            Link This issue Clones MB-47773 [ MB-47773 ]
            neil.huang Neil Huang made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            Fix Version/s Neo [ 17615 ]
            Affects Version/s Neo [ 17615 ]
            wayne Wayne Siu made changes -
            Link This issue blocks MB-46308 [ MB-46308 ]
            wayne Wayne Siu made changes -
            Labels backport-candidate approved-for-7.0.1
            neil.huang Neil Huang made changes -
            Assignee: Neil Huang [ neil.huang ] -> Pavithra Mahamani [ pavithra.mahamani ]
            Resolution Fixed [ 1 ]
            Status: Open [ 1 ] -> Resolved [ 5 ]
            lynn.straus Lynn Straus made changes -
            Fix Version/s 7.0.2 [ 18012 ]
            lynn.straus Lynn Straus made changes -
            Fix Version/s 7.0.1 [ 17104 ]
            pavithra.mahamani Pavithra Mahamani made changes -
            Labels: approved-for-7.0.1 -> approved-for-7.0.1 request-dev-verify
            pavithra.mahamani Pavithra Mahamani made changes -
            Assignee: Pavithra Mahamani [ pavithra.mahamani ] -> Neil Huang [ neil.huang ]
            neil.huang Neil Huang made changes -
            Status: Resolved [ 5 ] -> Closed [ 6 ]
            ianmccloy Ian McCloy made changes -
            Summary: "[BP 7.0.1] - XDCR - backfill_request_handler could hang forever" -> "[BP 7.0.2] - XDCR - backfill_request_handler could hang forever"

            People

              neil.huang Neil Huang
              neil.huang Neil Huang