[BP 7.1.4] - XDCR - Backfill Request Handler deadlock

Description

The stack trace shows many goroutines stuck:

The backfill request handler appears to be stuck on a wait and is unable to handle further requests.

That job is waiting on:

That, in turn, is running the handler that is invoked when explicit mapping is changed, and is deadlocking on something there:

We’ve got ourselves a deadlock problem.

Components

Fix versions

Labels

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Attachments

1

is a backport of

Activity

Ayush Nayyar February 20, 2023 at 8:06 AM

Replicated the bug on 7.1.3-3480. Seeing a similar stack trace and a panic:

goxdcr.log: /home/couchbase/.cbdepscache/exploded/x86_64/go-1.18.7/go/src/runtime/panic.go:992 +0x71 fp=0xc0003bf3a0 sp=0xc0003bf370 pc=0x43a891

The UI hangs and becomes unusable. I am unable to list or create replications from either the UI or REST. REST returns an unexpected server error: ["Unexpected server error, request logged."]

Validated the fix for this issue on 7.1.4-3585. Not seeing panics in logs or hangs in UI.

Neil Huang February 13, 2023 at 6:10 PM

Ritam Sharma February 13, 2023 at 3:14 PM

Can you please help with steps to validate?

CB robot February 3, 2023 at 12:31 AM

Build couchbase-server-7.1.4-3584 contains goxdcr commit 4f10009 with commit message:
: reproduction test case and injection needed to hit

CB robot February 3, 2023 at 12:31 AM

Build couchbase-server-7.1.4-3584 contains goxdcr commit 8caede0 with commit message:
: Fix and unit test for backfill manager and pipelineMgr deadlock

Fixed

Details

Assignee

Reporter

Is this a Regression?

No

Triage

Untriaged

Story Points

Priority

Created February 2, 2023 at 5:59 PM
Updated December 19, 2023 at 8:36 PM
Resolved February 2, 2023 at 10:18 PM