Couchbase Lite Java replication hangs when using DEBUG console + file logging on Windows

Description

Summary

When using console-based logging alongside file-based logging at a low level (such as DEBUG) in a Couchbase Lite Java application on Windows, the replication eventually gets stuck due to waiting for a logging call that never returns.

Steps to reproduce

1. Create a simple Java app (source code attached as

) designed to synchronize a non-trivial number of documents (my tests reproduce after approximately 50000 documents) from a remote Sync Gateway, ideally with a good network to ensure network isn't the bottleneck.
Ideally add some way to keep track of replication progress (via the replicator status).
2. Enable console logging at debug-level and file-based logging:

3. Run the app on Windows.

Expected Result

The app synchronizes all of the documents as expected.

Actual Result

The app eventually stops logging from Couchbase Lite and the replication makes no further progress.

Analysis

I grabbed the thread dump via Jstack when the app got into this 'stuck' state (

) and can see the following:

As you can see in this, the reason that replication is making no further progress is because the thread responsible from consuming new messages from the websocket is stuck in Lite Core's logging method.
I tried to get more detail about what Lite Core is doing at this stage but I guess it wasn't compiled with debug symbols as the stacktrace in windbg.exe was just addresses.

Trying to understand the problem a bit better, I made a few adjustments and found that this does not occur in the following situations:

  • When you use VERBOSE level instead of DEBUG-level, implying an issue specifically with DEBUG level or just the volume of logging generated.

  • When you use only console-based or only file-based logging, implying there is a potential deadlock caused by having 2 outputs.

  • When the application is running on Linux, implying this is Windows-specific.

Workaround

Do not enable debug logging for console logging.

Attachments

3

Activity

Show:

Blake Meike April 2, 2021 at 6:52 PM

I believe this was fixed in 2.8

Blake Meike August 4, 2020 at 6:36 PM

Hey , where are we on this?

I have seen hangs like this but I haven't seen one in the 2.8 code, for quite a while. I'm hoping that the problem has been fixed. If you can reproduce it or, better yet, explain how we can reproduce it, we could determine whether or not the bug still exists. The thread dump from this ticket doesn't really show any deadlock. It is certainly possible that there is something going on in Core but, if it is there, we'll need to see it to fix it.

Re-reading, I think I probably misunderstood your comment about verifying that both console and file logging were enabled. Even doing that, however, I have not seen a problem like this in a while.

Blake Meike May 28, 2020 at 4:29 PM

That really shouldn't be necessary.  If you can demonstrate that, please report it as a bug.  You shouldn't need to touch file-based logging at all, to get full console logging.  Just set

Database.log.getConsole().setLevel(LogLevel.DEBUG)

and, perhaps

Database.log.getConsole().setDomains(LogDomain.ALL_DOMAINS)

The CBL system uses a lot of threads.  Each may log, from time to time.  That is why there are two threads logging.

Matt Carabine May 28, 2020 at 9:37 AM

Did you definitely have both console logging and file-based logging enabled and set to debug level? From my testing it only triggered when both were enabled and set to debug level, possibly why there are two threads logging?

Matt Carabine May 28, 2020 at 9:36 AM

I'd have to go and setup all of the backend again, I nuked it because it seemed to just be related to dataset size and was reliably reproducible every time.

It will probably take some time as I had it all setup in AWS, I'll get back to you when I get time to set it all up again.

Fixed
Pinned fields
Click on the next to a field label to start pinning.

Details

Assignee

Reporter

Story Points

Components

Sprint

Fix versions

Affects versions

Priority

Instabug

Open Instabug

PagerDuty

Sentry

Zendesk Support

Created May 1, 2020 at 11:44 AM
Updated June 14, 2021 at 4:36 PM
Resolved April 2, 2021 at 6:52 PM
Instabug