  1. Couchbase Lite
  2. CBL-5791

Socket was not called to close after receiving WebSocket PING Timeout

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 3.1.9
    • 3.1.4
    • LiteCore
    • Security Level: Public
    • None
    • 2

    Description

      Per CBSE-17185, the socket was half-closed (caused by unplugging the network cable). The provided log shows that the WebSocket kept sending PING messages even though no PONG message was received and each PING timed out.

      Moreover, every time a PING message timed out, the message below was logged, but WebSocketImpl did not actually close the socket.

      [ERROR][couchbase_manager][{BuiltInWebSocket#8} No response received after 10 sec -- disconnecting] 

      In general, the WebSocket PING/PONG exchange also serves as a mechanism for detecting a half-closed socket. In the CBSE ticket's case, the half-closed socket was detected when the PING message timed out, but the current implementation appears to ignore the timeout: it only logs the disconnecting message shown above.

      Analysis
      1. The timedOut() function linked below closes the socket only when the socket life cycle state is SOCKET_OPENING.

      https://github.com/couchbase/couchbase-lite-core/blob/release/3.1/Networking/WebSockets/WebSocketImpl.cc#L396-L415

      2. From the WebSocketImpl code, it seems that timedOut() should also close the socket when the socket life cycle state is SOCKET_OPENED. If it did, the socket would be closed when the PING message times out.

      3. I have also noticed that the code linked below starts the response timeout while the state is SOCKET_OPENED. As a result, if that response timeout fires, the code does nothing but log, just as when the PING message times out.

      https://github.com/couchbase/couchbase-lite-core/blob/release/3.1/Networking/WebSockets/WebSocketImpl.cc#L496
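
      The difference between the behavior described in point 1 and the change suggested in point 2 can be illustrated with a small, self-contained model. This is only a sketch and not the real WebSocketImpl code: the SketchWebSocket class, the SocketState enum, and the closeWhenOpened flag are hypothetical names introduced for illustration, and the only assumption taken from the ticket is that timedOut() logs the message and closes the socket in the opening state only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the socket life cycle states; only SOCKET_OPENING
// and SOCKET_OPENED are named in this ticket.
enum class SocketState { Opening, Opened, Closed };

// Toy model of the timeout handling, NOT the real WebSocketImpl.
class SketchWebSocket {
  public:
    // closeWhenOpened = false models the behavior described in Analysis point 1;
    // closeWhenOpened = true models the change suggested in Analysis point 2.
    explicit SketchWebSocket(bool closeWhenOpened) : _closeWhenOpened(closeWhenOpened) {}

    void setState(SocketState s) { _state = s; }
    SocketState state() const { return _state; }
    const std::vector<std::string>& log() const { return _log; }

    // Invoked when no response arrived within timeoutSecs.
    void timedOut(int timeoutSecs) {
        assert(timeoutSecs > 0);
        // The message is logged regardless of the state.
        _log.push_back("No response received after " + std::to_string(timeoutSecs)
                       + " sec -- disconnecting");
        // The socket is closed only in the states that are handled.
        bool shouldClose = _state == SocketState::Opening
                           || (_closeWhenOpened && _state == SocketState::Opened);
        if (shouldClose) closeSocket();
    }

  private:
    void closeSocket() { _state = SocketState::Closed; }

    bool _closeWhenOpened;
    SocketState _state = SocketState::Opening;
    std::vector<std::string> _log;
};
```

      In this model, with closeWhenOpened set to false, a socket in the Opened state stays Opened after timedOut(10) and only the log line is recorded, which matches the symptom in the log. With closeWhenOpened set to true, the same call leaves the socket Closed.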

      How to reproduce the issue?
      1. Comment out the line linked below in the LiteCore code so that the PING message is never actually sent, which causes the PING timeout.

      https://github.com/couchbase/couchbase-lite-core/blob/release/3.1/Networking/WebSockets/WebSocketImpl.cc#L379

      2. Start a continuous replicator (I tested with a pull replicator) connected to a Sync Gateway. I recommend setting the heartbeat to 30 seconds when creating the replicator; otherwise the default heartbeat is 5 minutes.

      3. Wait until the replicator is IDLE; the PING message will then be sent according to the configured heartbeat interval.

      4. The PING message timeout is 10 seconds, so wait another 10 seconds to see the timeout and the disconnecting log message mentioned above.
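
      As a rough guide to how long to wait in steps 3 and 4, the sketch below adds the heartbeat interval to the PING timeout. The helper name firstPingTimeoutAfterIdle is hypothetical and not part of LiteCore, and the calculation assumes the first PING is attempted one full heartbeat interval after the replicator becomes IDLE, which may not hold exactly in practice.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of LiteCore: estimates how long after the
// replicator goes IDLE the first PING timeout log line should appear.
// Assumes the first PING is attempted one heartbeat interval after IDLE.
inline std::chrono::seconds firstPingTimeoutAfterIdle(std::chrono::seconds heartbeat,
                                                      std::chrono::seconds pingTimeout) {
    assert(heartbeat.count() > 0 && pingTimeout.count() > 0);
    return heartbeat + pingTimeout;
}
```

      Under that assumption, a 30-second heartbeat with the 10-second PING timeout gives roughly 40 seconds after IDLE, whereas the default 5-minute heartbeat would give roughly 310 seconds.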


            People

              callum.birks Callum Birks
              pasin Pasin Suriyentrakorn