I present, as evidence, the following:
(This issue is fairly simple to reproduce. I've included some things that I think might be useful, here, but I can probably get whatever you need, if this is insufficient.)
This code will log a message on the return from the native method `closed`:
protected final void closed(int errorDomain, int errorCode, String message) {
|
final long handle = getPeerUnchecked();
|
Log.d(LOG_DOMAIN, "C4Socket.closed @%d: %d", handle, errorCode);
|
if (handle == 0) { return; }
|
closed(handle, errorDomain, errorCode, message);
|
Log.d(LOG_DOMAIN, "C4Socket.close complete");
|
}
|
Here is the native method, FWIW:
JNIEXPORT void JNICALL
|
Java_com_couchbase_lite_internal_core_C4Socket_closed(
|
JNIEnv *env,
|
jclass ignore,
|
jlong jSocket,
|
jint domain,
|
jint code,
|
jstring message) {
|
auto socket = (C4Socket *) jSocket;
|
jstringSlice sliceMessage(env, message);
|
C4Error error = c4error_make((C4ErrorDomain) domain, code, sliceMessage);
|
c4socket_closed(socket, error);
|
}
|
Here is a bit of the log:
Sep 29, 2021, 2:02:12 PM 57(Thread-30) E CouchbaseLite/NETWORK:{C4SocketImpl#7} No response received after 15 sec -- disconnecting
|
Sep 29, 2021, 2:02:12 PM 58(Thread-31) D CouchbaseLite/NETWORK:C4Socket.requestClose @140678238411008: AbstractCBLWebSocket{ws://127.0.0.1:4984/todo/_blipsync} #504(Timed out)
|
Sep 29, 2021, 2:02:12 PM 58(Thread-31) V CouchbaseLite/NETWORK:AbstractCBLWebSocket{ws://127.0.0.1:4984/todo/_blipsync}:Core request close: null 504
|
Sep 29, 2021, 2:02:12 PM 58(Thread-31) D CouchbaseLite/DATABASE:StateMachine@0x3c85bacb: transition CONNECTING => CLOSING
|
Sep 29, 2021, 2:02:12 PM 58(Thread-31) D CouchbaseLite/DATABASE:StateMachine@0x3c85bacb: transition CLOSING => CLOSED
|
Sep 29, 2021, 2:02:12 PM 58(Thread-31) I CouchbaseLite/NETWORK:WebSocket CLOSED: 504(Timed out)
|
Sep 29, 2021, 2:02:12 PM 58(Thread-31) D CouchbaseLite/NETWORK:C4Socket.closed @140678238411008: 504
|
As you can see, Thread-31 enters the native method `closed` but does not leave.
Thread-31 is in the native method "closed" an is not waiting on any java mutex:
"Thread-31" #58 prio=5 os_prio=31 cpu=445.63ms elapsed=7.77s tid=0x00007ff235294000 nid=0x370b runnable [0x0000700007ef2000]
|
java.lang.Thread.State: RUNNABLE
|
at com.couchbase.lite.internal.core.C4Socket.closed(Native Method)
|
at com.couchbase.lite.internal.core.C4Socket.closed(C4Socket.java:232)
|
at com.couchbase.lite.internal.replicator.AbstractCBLWebSocket.didClose(AbstractCBLWebSocket.java:601)
|
at com.couchbase.lite.internal.replicator.AbstractCBLWebSocket.requestClose(AbstractCBLWebSocket.java:420)
|
- locked <0x000000070d559d30> (a com.couchbase.lite.internal.utils.StateMachine)
|
at com.couchbase.lite.internal.core.C4Socket.requestClose(C4Socket.java:141)
|
|
It does hold the mutex for the variable named `state` (an instance of StateMachine) which prevents any other thread from making forward progress through the websocket code.
I present, as evidence to the contrary, an existing test which does exactly this:
https://github.com/couchbase/couchbase-lite-core/blob/18f3368186875461c49a559705f9a791e31c6fb9/Replicator/tests/ReplicatorAPITest.cc#L590-L594
So something else is likely going on