  1. Couchbase C client library libcouchbase
  2. CCBC-104

Client crashes when entrypoint node connection is dropped

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0.0beta
    • Fix Version/s: 2.0.0beta3
    • Component/s: library
    • Security Level: Public
    • Labels:
      None
    • Environment:
      Entry point node is the REST node passed to the constructor

      Description

      'DelayMin': 1, 'DelayMax': 10}}
      [driver driver.py:36] > MC_DS_MUTATE_SET:40 @19 {'DSType': 'DSTYPE_SEEDED', 'DS': {'Count': 1000, 'Repeat': '_REP_', 'VSeed': 'SimpleValue_', 'KSize': 100, 'Continuous': True, 'VSize': 12, 'KSeed': 'SimpleKey_'}, 'Options': {'TimeRes': 1, 'IterWait': True, 'DelayMin': 1, 'DelayMax': 10}}
      [driver failover.py:134] Ramp for 3 seconds..
      [driver failover.py:141] No service specified..
      [driver rest_client.py:710] fail_over successful
      sdkd_lcb: src/instance.c:495: relocate_packets: Assertion `ringbuffer_write(&dst->cmd_log, cmd.bytes, sizeof(cmd.bytes)) == sizeof(cmd.bytes)' failed.

      Program received signal SIGABRT, Aborted.
      [Switching to Thread 0x7ffff21a8700 (LWP 5726)]
      0x00007ffff6e7d475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
      64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
      (gdb) bt
      #0 0x00007ffff6e7d475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
      #1 0x00007ffff6e806f0 in *__GI_abort () at abort.c:92
      #2 0x00007ffff6e76621 in *__GI___assert_fail (
      assertion=0x7ffff6a20478 "ringbuffer_write(&dst->cmd_log, cmd.bytes, sizeof(cmd.bytes)) == sizeof(cmd.bytes)",
      file=<optimized out>, line=495, function=0x7ffff6a20790 "relocate_packets") at assert.c:81
      #3 0x00007ffff6a10724 in vbucket_stream_handler () from /home/mnunberg/src/sdkd-cpp/inst/2.0.0beta//lib/libcouchbase.so.2
      #4 0x00007ffff5bc90a4 in event_process_active_single_queue (activeq=0x635250, base=0x63c510) at event.c:1346
      #5 event_process_active (base=<optimized out>) at event.c:1416
      #6 event_base_loop (base=0x63c510, flags=0) at event.c:1617
      #7 0x00007ffff7bbc922 in CBSdkd::Handle::postsubmit (this=0x63c810, rs=..., nsubmit=1) at Handle.cpp:206
      #8 0x00007ffff7bbceeb in CBSdkd::Handle::dsMutate (this=0x63c810, cmd=..., ds=..., out=..., options=...) at Handle.cpp:300
      #9 0x00007ffff7bc8ab1 in CBSdkd::WorkerDispatch::_process_request (this=0x63c230, req=..., rs=0x63c420) at IODispatch.cpp:494
      #10 0x00007ffff7bc93d5 in CBSdkd::WorkerDispatch::run (this=0x63c230) at IODispatch.cpp:572
      #11 0x00007ffff7bc7276 in CBSdkd::new_worker_thread (worker=0x63c230) at IODispatch.cpp:227
      #12 0x00007ffff71d8b50 in start_thread (arg=<optimized out>) at pthread_create.c:304
      #13 0x00007ffff6f2370d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
      #14 0x0000000000000000 in ?? ()
      (gdb) [driver rest_client.py:710] fail_over successful
      [driver failover.py:149] Sleeping for 3 seconds after failover
      [driver driver.py:36] > CANCEL:41 @2 {}
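The assertion in the log requires `ringbuffer_write` to report that every byte of `cmd.bytes` was stored in the destination. The ticket does not say why the write came up short. One common way for a fixed-capacity ring buffer to return a short count is a destination without enough free space. A minimal sketch of that behaviour, using a toy buffer with hypothetical names (`toy_ring`, `toy_ring_write`) rather than libcouchbase's own ringbuffer code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy ring buffer; NOT the libcouchbase implementation. */
typedef struct {
    unsigned char data[8]; /* fixed capacity, chosen small for illustration */
    size_t head;           /* index of the oldest stored byte */
    size_t used;           /* number of bytes currently stored */
} toy_ring;

/* Copy up to n bytes into the buffer and return how many were stored.
 * A return value smaller than n means the buffer ran out of free space. */
static size_t toy_ring_write(toy_ring *rb, const void *src, size_t n)
{
    const unsigned char *p = src;
    size_t cap, room, i;

    assert(rb != NULL && src != NULL);
    cap = sizeof(rb->data);
    room = cap - rb->used;
    if (n > room) {
        n = room; /* short write: only the free space is filled */
    }
    for (i = 0; i < n; i++) {
        rb->data[(rb->head + rb->used + i) % cap] = p[i];
    }
    rb->used += n;
    return n;
}
```

A caller that asserts `toy_ring_write(...) == n` aborts on the short write in the same way as the trace above, unless it grows or drains the destination first.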


        Activity

        ingenthr Matt Ingenthron added a comment -

        Scenario to be concerned with is:
        client is working with the cluster, bootstrapping off of node A of a cluster of three (A, B, C)
        client is listening for config changes from node A
        trond walks by and hits the reset button on node A (because he's mean)
        no TCP RST
        operations for node A keep timing out
        even if autofailover is enabled, the config updates,
        and nodes B and C take over,
        the client will never get an update and never reconfigure, because it is still listening only to node A

        avsej Sergey Avseyev added a comment -

        This patch counts failures on the paired memcached socket; once the number of failures exceeds a threshold, it reconnects the config connection

        http://review.couchbase.org/22686
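The counting described in this comment could be sketched roughly as follows. This is an illustration only, not the code from the review: the names `config_watchdog`, `watchdog_init` and `watchdog_record` are hypothetical, and resetting the counter after a successful operation is an assumption the comment does not confirm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; NOT libcouchbase's API. */
typedef struct {
    unsigned failures;   /* failures counted since the last reset */
    unsigned threshold;  /* failures tolerated before reconnecting */
    unsigned reconnects; /* how often a config reconnect was requested */
} config_watchdog;

static void watchdog_init(config_watchdog *w, unsigned threshold)
{
    assert(w != NULL);
    w->failures = 0;
    w->threshold = threshold;
    w->reconnects = 0;
}

/* Record the outcome of one operation on the paired memcached socket.
 * Returns true when the failure count exceeds the threshold, meaning the
 * caller should reconnect the configuration connection. */
static bool watchdog_record(config_watchdog *w, bool ok)
{
    assert(w != NULL);
    if (ok) {
        w->failures = 0; /* assumption: a success clears the count */
        return false;
    }
    if (++w->failures > w->threshold) {
        w->failures = 0; /* start counting afresh after the reconnect */
        w->reconnects++;
        return true;
    }
    return false;
}
```

With a threshold of 2, the third consecutive failure returns true. In the scenario above, timeouts against node A would eventually trigger a reconnect of the config connection instead of leaving the client waiting on a dead stream.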

        avsej Sergey Avseyev added a comment -

        Needs verification

        avsej Sergey Avseyev added a comment -

        http://review.couchbase.org/22691

        The patch has been merged already, but requires verification on the test installation

        mnunberg Mark Nunberg added a comment -

        Fixed and verified in latest commit


          People

          • Assignee:
            mnunberg Mark Nunberg
            Reporter:
            mnunberg Mark Nunberg