Uploaded image for project: 'Couchbase .NET client library'
  1. Couchbase .NET client library
  2. NCBC-1761

kv async operation does not recover after 1 node swap rebalance

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.2
    • Component/s: None
    • Labels:
      None

      Description

      commit : 84db0f4

      SDK log is http://sdk-testresults.couchbase.com.s3.amazonaws.com/2018941946521247-1.gz

      Test was done with 1 node swap rebalance while kv ASYNC operation is running.
      Client is running on .net 4.5.2 (not netcore)
      Expected: operation keeps running with no issue
      Actual: operation does not recover.

        Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

          Hide
          jaekwon.park Jae Park [X] (Inactive) added a comment -

          attached the log with trace log level.
          To explain the log, According to jenkins time stamp,
          Rebalance begin : 22:23:56 (22:23:26 in SDK log)
          Rebalance complete : 22:25:13 (22:24:47 in SDK log)
          Added : 172.23.120.17
          Removed : 172.23.120.247

          Looks like jenkins time stamp is 30 seconds ahead of SDK log time stamp.

          In the log, Line 883396 seems the line when rebalance begins and got new config from the server
          Config changed (forced:False) new Rev#977 | old Rev#975 CCCP

          and another at line 884144
          Config changed (forced:False) new Rev#977 | old Rev#970 CCCP

          another at line 884380
          Config changed (forced:False) new Rev#977 | old Rev#960 CCCP

          another at line 884510
          Config changed (forced:False) new Rev#977 | old Rev#972 CCCP

          another at line 884809
          Config changed (forced:False) new Rev#977 | old Rev#970 CCCP <- this is same rev as line 884144

          continues till 22:24:14 (SDK log time stamp)
          from 22:24:22(SDK log time stamp), upsert async response time gradually increase to 100 ms range then continues grow and at the end of the test, it grew to 107 seconds(107003 ms)

          Things I noticed are
          1. node(172.23.120.17) to be added shows up in 22:23:34 (sdk log time stamp) which is about 12 seconds after rebalance begin.
          2. new node shows up from the early rebalance process (does this mean the new node takes traffic from before rebalance complete?)
          3. This exception starts while adding new node.
          2018-09-05 22:23:34,276 [29] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)
          2018-09-05 22:23:34,276 [29] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7150718, current: 22:23:34.2760371, count: 2
          2018-09-05 22:23:34,276 [29] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs...
          2018-09-05 22:23:34,276 [36] INFO Sdkd.Commands.CommandBase - commandID:cccfa555-cd0b-4d6b-b256-cfc9626cd6fc:SENT
          2018-09-05 22:23:34,276 [16] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)

          018-09-05 22:23:34,291 [29] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)

          2018-09-05 22:23:34,323 [16] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)
          2018-09-05 22:23:34,323 [16] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7680398, current: 22:23:34.3230393, count: 2
          2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs...
          2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Checking configs...
          2018-09-05 22:23:34,323 [188] WARN Couchbase.IO.ConnectionBase - Handling disconnect for connection 0dd7216e-323e-4376-90bc-a0ac4d5505ce: Couchbase.IO.RemoteHostClosedException: The remote host (172.23.120.17:11210) has gracefully closed this connection.
          2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Using index 3 - server Couchbase.Core.Server
          2018-09-05 22:23:34,323 [188] INFO Couchbase.IO.ConnectionBase - Closing connection 0dd7216e-323e-4376-90bc-a0ac4d5505ce
          2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Core.Server - Sending Config with key using server 172.23.120.17:11210
          2018-09-05 22:23:34,323 [16] DEBUG Couchbase.IO.SharedConnectionPool`1[[Couchbase.IO.MultiplexingConnection, Couchbase.NetClient, Version=2.5.7.0, Culture=neutral, PublicKeyToken=null]] - Trying to acquire new connection! Refs=1
          2018-09-05 22:23:34,323 [29] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)

          2018-09-05 22:23:34,370 [17] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)
          2018-09-05 22:23:34,370 [149] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)
          2018-09-05 22:23:34,370 [149] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7990363, current: 22:23:34.3700443, count: 2
          2018-09-05 22:23:34,370 [198] WARN Couchbase.IO.ConnectionBase - Handling disconnect for connection a5b25223-c28c-41d7-96a1-f75fb596b759: Couchbase.IO.RemoteHostClosedException: The remote host (172.23.120.17:11210) has gracefully closed this connection.
          2018-09-05 22:23:34,370 [198] INFO Couchbase.IO.ConnectionBase - Closing connection a5b25223-c28c-41d7-96a1-f75fb596b759
          2018-09-05 22:23:34,370 [149] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs...
          2018-09-05 22:23:34,370 [7] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)
          2018-09-05 22:23:34,371 [25] INFO Sdkd.Commands.CommandBase - commandID:4ab0a4dd-b00d-4d55-b975-02117e151612:SENT
          2018-09-05 22:23:34,371 [7] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7970352, current: 22:23:34.3710396, count: 2
          2018-09-05 22:23:34,370 [17] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7990363, current: 22:23:34.3700443, count: 2
          2018-09-05 22:23:34,371 [7] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs...
          2018-09-05 22:23:34,371 [25] TRACE Couchbase.Configuration.ConfigContextBase - Getting KeyMapper for rev#51 on thread 25
          2018-09-05 22:23:34,371 [139] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed
          at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
          at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
          at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle)
          at System.Threading.EventWaitHandle.Reset()
          at Couchbase.IO.SyncState.CleanForReuse()
          at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)

          4. This exception when around time rebalance completed (22:24:43 sdk log)
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)
          2018-09-05 22:24:43,274 [76] INFO Couchbase.Core.Server - Checking if node 172.23.120.247:11210 should be down - last: 22:22:47.1250377, current: 22:24:43.2740366, count: 1
          2018-09-05 22:24:43,274 [76] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs...
          2018-09-05 22:24:43,275 [14] INFO Sdkd.Commands.CommandBase - commandID:fcf0afe4-f729-4e5f-b74d-e534dc39b124:SENT
          2018-09-05 22:24:43,275 [5] INFO Couchbase.IO.ConnectionBase - The operation has timed out. ["s":"kv","i":"2a3d8","c":"c90208621d9fa951/c70fd823d0e0bd39","b":"default","l":"172.23.109.123:49256","r":"172.23.120.247:11210","t":15000000]
          2018-09-05 22:24:43,275 [5] DEBUG Couchbase.IO.Services.PooledIOService - Couchbase.IO.SendTimeoutExpiredException: The operation has timed out. ["s":"kv","i":"2a3d8","c":"c90208621d9fa951/c70fd823d0e0bd39","b":"default","l":"172.23.109.123:49256","r":"172.23.120.247:11210","t":15000000]
          at Couchbase.IO.MultiplexingConnection.Send(Byte[] request)
          at Couchbase.IO.Services.PooledIOService.Execute[T](IOperation`1 operation)

          I'm not sure above is expected but wanted to share any suspicious logs.
          So, to summarize, when rebalance begins, the new node shows up in the log with exception and old node removed with exception. while rebalance and afterwards, response time increased from 0ms to 107 seconds.

          Hope this helps.

          Show
          jaekwon.park Jae Park [X] (Inactive) added a comment - attached the log with trace log level. To explain the log, According to jenkins time stamp, Rebalance begin : 22:23:56 (22:23:26 in SDK log) Rebalance complete : 22:25:13 (22:24:47 in SDK log) Added : 172.23.120.17 Removed : 172.23.120.247 Looks like jenkins time stamp is 30 seconds ahead of SDK log time stamp. In the log, Line 883396 seems the line when rebalance begins and got new config from the server Config changed (forced:False) new Rev#977 | old Rev#975 CCCP and another at line 884144 Config changed (forced:False) new Rev#977 | old Rev#970 CCCP another at line 884380 Config changed (forced:False) new Rev#977 | old Rev#960 CCCP another at line 884510 Config changed (forced:False) new Rev#977 | old Rev#972 CCCP another at line 884809 Config changed (forced:False) new Rev#977 | old Rev#970 CCCP <- this is same rev as line 884144 continues till 22:24:14 (SDK log time stamp) from 22:24:22(SDK log time stamp), upsert async response time gradually increase to 100 ms range then continues grow and at the end of the test, it grew to 107 seconds(107003 ms) Things I noticed are 1. node(172.23.120.17) to be added shows up in 22:23:34 (sdk log time stamp) which is about 12 seconds after rebalance begin. 2. new node shows up from the early rebalance process (does this mean the new node takes traffic from before rebalance complete?) 3. This exception starts while adding new node. 2018-09-05 22:23:34,276 [29] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:23:34,276 [29] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7150718, current: 22:23:34.2760371, count: 2 2018-09-05 22:23:34,276 [29] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs... 2018-09-05 22:23:34,276 [36] INFO Sdkd.Commands.CommandBase - commandID:cccfa555-cd0b-4d6b-b256-cfc9626cd6fc:SENT 2018-09-05 22:23:34,276 [16] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 018-09-05 22:23:34,291 [29] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:23:34,323 [16] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:23:34,323 [16] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7680398, current: 22:23:34.3230393, count: 2 2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs... 2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Checking configs... 2018-09-05 22:23:34,323 [188] WARN Couchbase.IO.ConnectionBase - Handling disconnect for connection 0dd7216e-323e-4376-90bc-a0ac4d5505ce: Couchbase.IO.RemoteHostClosedException: The remote host (172.23.120.17:11210) has gracefully closed this connection. 2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Using index 3 - server Couchbase.Core.Server 2018-09-05 22:23:34,323 [188] INFO Couchbase.IO.ConnectionBase - Closing connection 0dd7216e-323e-4376-90bc-a0ac4d5505ce 2018-09-05 22:23:34,323 [16] DEBUG Couchbase.Core.Server - Sending Config with key using server 172.23.120.17:11210 2018-09-05 22:23:34,323 [16] DEBUG Couchbase.IO.SharedConnectionPool`1[ [Couchbase.IO.MultiplexingConnection, Couchbase.NetClient, Version=2.5.7.0, Culture=neutral, PublicKeyToken=null] ] - Trying to acquire new connection! Refs=1 2018-09-05 22:23:34,323 [29] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:23:34,370 [17] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:23:34,370 [149] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:23:34,370 [149] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7990363, current: 22:23:34.3700443, count: 2 2018-09-05 22:23:34,370 [198] WARN Couchbase.IO.ConnectionBase - Handling disconnect for connection a5b25223-c28c-41d7-96a1-f75fb596b759: Couchbase.IO.RemoteHostClosedException: The remote host (172.23.120.17:11210) has gracefully closed this connection. 2018-09-05 22:23:34,370 [198] INFO Couchbase.IO.ConnectionBase - Closing connection a5b25223-c28c-41d7-96a1-f75fb596b759 2018-09-05 22:23:34,370 [149] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs... 2018-09-05 22:23:34,370 [7] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:23:34,371 [25] INFO Sdkd.Commands.CommandBase - commandID:4ab0a4dd-b00d-4d55-b975-02117e151612:SENT 2018-09-05 22:23:34,371 [7] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7970352, current: 22:23:34.3710396, count: 2 2018-09-05 22:23:34,370 [17] INFO Couchbase.Core.Server - Checking if node 172.23.120.17:11210 should be down - last: 22:23:19.7990363, current: 22:23:34.3700443, count: 2 2018-09-05 22:23:34,371 [7] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs... 2018-09-05 22:23:34,371 [25] TRACE Couchbase.Configuration.ConfigContextBase - Getting KeyMapper for rev#51 on thread 25 2018-09-05 22:23:34,371 [139] DEBUG Couchbase.IO.Services.PooledIOService - System.ObjectDisposedException: Safe handle has been closed at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success) at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success) at Microsoft.Win32.Win32Native.ResetEvent(SafeWaitHandle handle) at System.Threading.EventWaitHandle.Reset() at Couchbase.IO.SyncState.CleanForReuse() at Couchbase.IO.MultiplexingConnection.ReleaseState(SyncState state) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 4. This exception when around time rebalance completed (22:24:43 sdk log) at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) 2018-09-05 22:24:43,274 [76] INFO Couchbase.Core.Server - Checking if node 172.23.120.247:11210 should be down - last: 22:22:47.1250377, current: 22:24:43.2740366, count: 1 2018-09-05 22:24:43,274 [76] DEBUG Couchbase.Configuration.Server.Monitoring.ConfigMonitor - Waiting to check configs... 2018-09-05 22:24:43,275 [14] INFO Sdkd.Commands.CommandBase - commandID:fcf0afe4-f729-4e5f-b74d-e534dc39b124:SENT 2018-09-05 22:24:43,275 [5] INFO Couchbase.IO.ConnectionBase - The operation has timed out. ["s":"kv","i":"2a3d8","c":"c90208621d9fa951/c70fd823d0e0bd39","b":"default","l":"172.23.109.123:49256","r":"172.23.120.247:11210","t":15000000] 2018-09-05 22:24:43,275 [5] DEBUG Couchbase.IO.Services.PooledIOService - Couchbase.IO.SendTimeoutExpiredException: The operation has timed out. ["s":"kv","i":"2a3d8","c":"c90208621d9fa951/c70fd823d0e0bd39","b":"default","l":"172.23.109.123:49256","r":"172.23.120.247:11210","t":15000000] at Couchbase.IO.MultiplexingConnection.Send(Byte[] request) at Couchbase.IO.Services.PooledIOService.Execute [T] (IOperation`1 operation) I'm not sure above is expected but wanted to share any suspicious logs. So, to summarize, when rebalance begins, the new node shows up in the log with exception and old node removed with exception. while rebalance and afterwards, response time increased from 0ms to 107 seconds. Hope this helps.
          Hide
          jaekwon.park Jae Park [X] (Inactive) added a comment -

          verified in current top of the master 3c06352892dfbc448c7a38a45ae86cacf0448d6c

          Show
          jaekwon.park Jae Park [X] (Inactive) added a comment - verified in current top of the master 3c06352892dfbc448c7a38a45ae86cacf0448d6c

            People

            • Assignee:
              mike.goldsmith Michael Goldsmith
              Reporter:
              jaekwon.park Jae Park [X] (Inactive)
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Gerrit Reviews

                There are no open Gerrit changes

                  PagerDuty

                  Error rendering 'com.pagerduty.jira-server-plugin:PagerDuty'. Please contact your Jira administrators.