Details
Description
According to CBSE-14057, the replicator was stuck in the busy state, and the log shows 3 "Invalid delta" exceptions being thrown, as in the following example:
2023-03-27 09:54:03.747320-0700 InspectQA[86248:5418129] CouchbaseLite Replicator Verbose: {IncomingRev#166} Need to apply delta immediately for 'corrective-assignment-gc-120598123' #3-8f07875796c39d780a151b87ae8895ae ... |
2023-03-27 09:54:03.747316-0700 InspectQA[86248:5418133] CouchbaseLite Network Verbose: (...sent 136 bytes) |
2023-03-27 09:54:03.747493-0700 InspectQA[86248:5392713] CouchbaseLite Replicator Verbose: {IncomingRev#164} Received revision 'corrective-assignment-gc-122506241' #3-b481f098e7056e8865f4bc1314429f42 (seq '59105725') |
2023-03-27 09:54:04.422017-0700 InspectQA[86248:5418130] CouchbaseLite Network Verbose: {Connection#4} Finished receiving 'changes' REQ #15382 Z |
2023-03-27 09:54:04.422196-0700 InspectQA[86248:5417861] CouchbaseLite Database Verbose: {DB#22} commit transaction |
2023-03-27 09:54:04.422388-0700 InspectQA[86248:5418129] CouchbaseLite Replicator ERROR: {IncomingRev#166} Threw C++ exception: Invalid delta |
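Several minutes later in the same log, the replicator still reports busy, with progress 3 short of the total (15160/15163), which matches the number of exceptions: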
2023-03-27 10:01:05.079499-0700 InspectQA[86248:5429221] CouchbaseLite Replicator Info: CBLReplicator[<*> URL[wss://mradqa.digicat.cloud.pge.com/sync_gateway/asset360]] is busy, progress 15160/15163, error: (null) |
Analysis:
1. Here is the line that throws the exception. I cannot tell from the log what the root cause of "Invalid delta" is.
https://github.com/couchbase/couchbase-lite-core/blob/release/lithium/Replicator/DBAccess.cc#L384
2. From the log message "Need to apply delta immediately for", the code path is as follows:
3. To reproduce the issue, instead of trying to make the replicator fail with a delta error, I manually changed the code to throw an error from inside the IncomingRev::parseAndInsert(alloc_slice jsonBody) function here.
https://github.com/couchbase/couchbase-lite-core/blob/release/lithium/Replicator/IncomingRev.cc#L159
After that, running a db-to-db pull replication that pulls a few docs leaves the replicator stuck in the busy state.
4. If an exception is thrown from the `IncomingRev::parseAndInsert(alloc_slice jsonBody)` function, the error doesn't get handled properly. Normally, when there is an error rather than an exception, the error and the rev being pulled are handled in failWithError().
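To make point 4 concrete, here is a minimal, self-contained sketch of how an escaped exception could leave the replicator busy. It is not LiteCore code: PullerSketch, insertUnguarded, insertGuarded and busyAfter are hypothetical names, and the pending-rev counter is an assumed simplification of the real bookkeeping. Only the names parseAndInsert and failWithError, and the "Threw C++ exception" log wording, come from the ticket.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the puller's bookkeeping (assumption, not LiteCore).
class PullerSketch {
public:
    // A revision has been handed to an IncomingRev-like worker.
    void revStarted() { ++pendingRevs_; }

    // The revision is done, successfully or not.
    void revFinished() {
        assert(pendingRevs_ > 0);
        --pendingRevs_;
    }

    // Error path, similar in spirit to failWithError(): record the error
    // and release the revision so it no longer counts as pending.
    void failWithError(const std::string& message) {
        lastError_ = message;
        revFinished();
    }

    bool isBusy() const { return pendingRevs_ > 0; }
    const std::string& lastError() const { return lastError_; }

private:
    int pendingRevs_ = 0;
    std::string lastError_;
};

// Unguarded: if parseAndInsert throws, revFinished() is never reached.
void insertUnguarded(PullerSketch& puller, const std::function<void()>& parseAndInsert) {
    puller.revStarted();
    parseAndInsert();
    puller.revFinished();
}

// Guarded: the exception is converted into the normal error path.
void insertGuarded(PullerSketch& puller, const std::function<void()>& parseAndInsert) {
    puller.revStarted();
    try {
        parseAndInsert();
        puller.revFinished();
    } catch (const std::exception& x) {
        puller.failWithError(x.what());
    }
}

// Simulates an outer layer that only logs an escaped exception,
// then reports whether the puller still considers itself busy.
bool busyAfter(bool guarded) {
    PullerSketch puller;
    auto failing = [] { throw std::runtime_error("Invalid delta"); };
    try {
        if (guarded)
            insertGuarded(puller, failing);
        else
            insertUnguarded(puller, failing);
    } catch (const std::exception& x) {
        std::cerr << "Threw C++ exception: " << x.what() << '\n';
    }
    return puller.isBusy();
}
```

In this sketch, busyAfter(false) leaves one rev pending forever, while busyAfter(true) releases it through failWithError(). Whether the real fix should catch inside parseAndInsert() or at its caller depends on the answers to the questions below.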
Things to discuss:
1. When an exception is thrown from applyDelta(), is it a permanent or a recoverable error?
2. When an exception is thrown from parseAndInsert() somewhere other than applyDelta(), is it a permanent or a recoverable error?