Description
What's the issue?
Since MB-41372, we don't propagate errors from the restore worker pool for cbimport as we do for cbbackupmgr. James Lee says this was for a couple of reasons:
- Some errors can't be detected ahead of time, e.g. the imported data producing an invalid packet when it is sent.
- The import should be resilient to failures on individual items (e.g. a document that is too big, or isn't valid JSON).
As a result, some scenarios - such as a full ephemeral bucket, or the bucket being deleted from under us - take an extremely long time to fail/error out, because each document must exhaust its 5 minutes of retries.
What's the fix?
We could try to handle these issues individually, e.g. by detecting that the bucket has been deleted via /pools/default. However, this feels like it could become a game of whack-a-mole. A better approach might be to revisit the reasons for not propagating errors and address them, so that we can propagate errors once again. For example, we already validate that each document is valid JSON/UTF-8.