Replicator is stuck in busy state when an error is thrown while applying a delta to create the full Fleece doc

Description

According to https://couchbasecloud.atlassian.net/browse/CBSE-14057#icft=CBSE-14057, the replicator was stuck in the busy state, and the log shows 3 "Invalid delta" exceptions being thrown, as in the following example:

 

2023-03-27 09:54:03.747320-0700 InspectQA[86248:5418129] CouchbaseLite Replicator Verbose: {IncomingRev#166} Need to apply delta immediately for 'corrective-assignment-gc-120598123' #3-8f07875796c39d780a151b87ae8895ae
...
2023-03-27 09:54:03.747316-0700 InspectQA[86248:5418133] CouchbaseLite Network Verbose:     (...sent 136 bytes)
2023-03-27 09:54:03.747493-0700 InspectQA[86248:5392713] CouchbaseLite Replicator Verbose: {IncomingRev#164} Received revision 'corrective-assignment-gc-122506241' #3-b481f098e7056e8865f4bc1314429f42 (seq '59105725')
2023-03-27 09:54:04.422017-0700 InspectQA[86248:5418130] CouchbaseLite Network Verbose: {Connection#4} Finished receiving 'changes' REQ #15382 Z
2023-03-27 09:54:04.422196-0700 InspectQA[86248:5417861] CouchbaseLite Database Verbose: {DB#22} commit transaction
2023-03-27 09:54:04.422388-0700 InspectQA[86248:5418129] CouchbaseLite Replicator ERROR: {IncomingRev#166} Threw C++ exception: Invalid delta

About seven minutes later in the same log, the replicator is still reporting the busy status:

2023-03-27 10:01:05.079499-0700 InspectQA[86248:5429221] CouchbaseLite Replicator Info: CBLReplicator[<*> URL[wss://mradqa.digicat.cloud.pge.com/sync_gateway/asset360]] is busy, progress 15160/15163, error: (null)

Analysis:

1. Here is the line that throws the exception. The log does not show the root cause of the "Invalid delta" error.

https://github.com/couchbase/couchbase-lite-core/blob/release/lithium/Replicator/DBAccess.cc#L384

2. Based on the log message "Need to apply delta immediately for", the code path is as follows:

https://github.com/couchbase/couchbase-lite-core/blob/release/lithium/Replicator/IncomingRev.cc#L174-L176

3. To reproduce the issue, instead of trying to make the replicator fail with a delta error, I manually changed the code to throw an error from inside the IncomingRev::parseAndInsert(alloc_slice jsonBody) function here:

https://github.com/couchbase/couchbase-lite-core/blob/release/lithium/Replicator/IncomingRev.cc#L159

After that, running a db-to-db pull replication that pulls a few docs leaves the replicator stuck in the busy state.

4. If an exception is thrown from the `IncomingRev::parseAndInsert(alloc_slice jsonBody)` function, it is not handled properly. Normally, when there is an error (as opposed to an exception), the error and the rev being pulled are handled in failWithError().
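To illustrate point 4, here is a minimal standalone sketch of why an exception that bypasses the normal error path can leave a replicator busy forever. It is not the LiteCore code: `PullerModel`, `handleRevUnguarded`, `handleRevGuarded`, and the `pendingRevs` counter are hypothetical names invented for this illustration, and the assumption that "busy" is derived from a count of unfinished revisions is mine, not something confirmed by the log.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the puller's bookkeeping (NOT LiteCore's real classes).
struct PullerModel {
    int pendingRevs = 0;              // revisions started but not yet reported as finished
    std::vector<std::string> errors;  // errors reported through the normal error path

    bool isBusy() const { return pendingRevs > 0; }

    // Normal success path: the revision is reported as finished.
    void finished() {
        assert(pendingRevs > 0);
        --pendingRevs;
    }

    // Normal error path: record the error AND report the revision as finished.
    void failWithError(const std::string& message) {
        errors.push_back(message);
        finished();
    }
};

// Stand-in for the work done by parseAndInsert(); may throw.
using InsertFn = std::function<void()>;

// Models the reported behavior: the exception is only logged by an outer
// handler, so the revision is never reported as finished.
void handleRevUnguarded(PullerModel& puller, const InsertFn& parseAndInsert) {
    ++puller.pendingRevs;
    try {
        parseAndInsert();
        puller.finished();
    } catch (const std::exception& x) {
        std::cerr << "Threw C++ exception: " << x.what() << "\n";
        // pendingRevs is not decremented here, so isBusy() stays true.
    }
}

// One possible remedy (an assumption, not necessarily the actual fix):
// convert the exception into the normal error path.
void handleRevGuarded(PullerModel& puller, const InsertFn& parseAndInsert) {
    ++puller.pendingRevs;
    try {
        parseAndInsert();
        puller.finished();
    } catch (const std::exception& x) {
        puller.failWithError(x.what());
    }
}
```

Passing a function that throws std::runtime_error("Invalid delta") to handleRevUnguarded leaves isBusy() returning true, while handleRevGuarded records the error and returns the model to idle. Whether such an error should then be treated as permanent or recoverable is a separate question that this sketch does not answer.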

Things to discuss:

1. When an exception is thrown from applyDelta(), is it a permanent or a recoverable error?

2. When an exception is thrown from parseAndInsert() somewhere other than applyDelta(), is it a permanent or a recoverable error?

Activity


CB robot June 9, 2023 at 1:27 AM

Build couchbase-lite-c-3.1.1-3 contains couchbase-lite-core commit 20083db with commit message:
https://couchbasecloud.atlassian.net/browse/CBL-4445#icft=CBL-4445: Replicator may get stuck when there is an error of "Invalid delta" (#1801)

CB robot June 7, 2023 at 4:18 PM

Build couchbase-lite-ios-3.1.1-3 contains couchbase-lite-core commit 20083db with commit message:
https://couchbasecloud.atlassian.net/browse/CBL-4445#icft=CBL-4445: Replicator may get stuck when there is an error of "Invalid delta" (#1801)

CB robot June 6, 2023 at 10:29 PM

Build couchbase-lite-core-3.1.1-6 contains couchbase-lite-core commit 20083db with commit message:
https://couchbasecloud.atlassian.net/browse/CBL-4445#icft=CBL-4445: Replicator may get stuck when there is an error of "Invalid delta" (#1801)

Fixed
Created April 18, 2023 at 3:35 PM
Updated August 31, 2024 at 10:56 AM
Resolved June 7, 2023 at 12:18 AM