Description
The application sets up two connections to a 3-node Couchbase cluster using lcb_create() (with the libevent2 I/O plugin). Every 3 seconds, once the connection is bootstrapped, lcb_ping3() is issued and the callback is processed.
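The setup can be sketched roughly as follows. This is a hypothetical minimal sketch against the libcouchbase 2.7.x C API, not the reporter's actual code: the connection string, bucket name, and callback body are invented, the 3-second libevent timer and error handling are omitted, and the lcb_RESPPING/lcb_PINGSVC field names are assumptions that may differ from the 2.7.7 headers.

```c
#include <stdio.h>
#include <libcouchbase/couchbase.h>

/* Ping callback: logs one line per service entry, similar to the log
 * excerpt below. Field names (nservices, services, server, latency,
 * status) are assumed from the 2.7.x response structures. */
static void ping_callback(lcb_t instance, int cbtype, const lcb_RESPBASE *rb)
{
    const lcb_RESPPING *resp = (const lcb_RESPPING *)rb;
    size_t ii;
    for (ii = 0; ii < resp->nservices; ii++) {
        printf("service: KV, status: %d, host: %s, latency: %llu nanoseconds\n",
               (int)resp->services[ii].status,
               resp->services[ii].server,
               (unsigned long long)resp->services[ii].latency);
    }
}

int main(void)
{
    lcb_t instance;
    struct lcb_create_st options = { 0 };
    options.version = 3;
    /* Hypothetical connection string; the real application connects to
     * 192.168.91.84, 192.168.91.211 and 192.168.91.143. */
    options.v.v3.connstr = "couchbase://192.168.91.84/bucket";

    if (lcb_create(&instance, &options) != LCB_SUCCESS)
        return 1;
    lcb_connect(instance);
    lcb_wait(instance);

    lcb_install_callback3(instance, LCB_CALLBACK_PING,
                          (lcb_RESPCALLBACK)ping_callback);

    /* In the real application this block runs from a 3-second timer. */
    {
        lcb_CMDPING cmd = { 0 };
        cmd.services = LCB_PINGSVC_F_KV;
        lcb_ping3(instance, NULL, &cmd);
        lcb_wait(instance);
    }

    /* Crash site: with one node down, this quite often SIGSEGVs (see
     * the backtrace below). */
    lcb_destroy(instance);
    return 0;
}
```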
When couchbase-server is stopped on one of the nodes (no failover performed), the ping response correctly shows an error status for that node:
Store-con-0 | INFO | service: KV, status: 0, host: 192.168.91.84:11210, latency: 382559 nanoseconds in ping_callback@store_couchbase.c:3304
Store-con-0 | INFO | service: KV, status: 0, host: 192.168.91.211:11210, latency: 1024532 nanoseconds in ping_callback@store_couchbase.c:3304
Store-con-0 | INFO | service: KV, status: 44, host: 192.168.91.143:11210, latency: 1039704 nanoseconds in ping_callback@store_couchbase.c:3304
Now, when the application is stopped, lcb_destroy() quite often results in a SIGSEGV:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9d97a700 (LWP 3429)]
0x00007ffff7d4a3b5 in mcreq_allocate_packet (pipeline=0xab2170) at /usr/src/debug/libcouchbase-2.7.7/src/mc/mcreq.c:277
277         ret->opaque = pipeline->parent->seq++;
(gdb) where
#0  0x00007ffff7d4a3b5 in mcreq_allocate_packet (pipeline=0xab2170) at /usr/src/debug/libcouchbase-2.7.7/src/mc/mcreq.c:277
#1  0x00007ffff7d6c9b9 in lcb_st::request_config (this=0x912a60, cookie_=0x7fff940008c0, server=0xab2170)
    at /usr/src/debug/libcouchbase-2.7.7/src/getconfig.cc:46
#2  0x00007ffff7d6144e in CccpProvider::schedule_next_request (this=0x9132c0, err=<value optimized out>,
    can_rollover=<value optimized out>) at /usr/src/debug/libcouchbase-2.7.7/src/bucketconfig/bc_cccp.cc:145
#3  0x00007ffff7d6185b in lcb::clconfig::cccp_update (cookie_=<value optimized out>, err=LCB_ERROR, bytes=0x0, nbytes=0, origin=0x0)
    at /usr/src/debug/libcouchbase-2.7.7/src/bucketconfig/bc_cccp.cc:234
#4  0x00007ffff7d6c944 in ext_callback_proxy (pl=<value optimized out>, req=<value optimized out>, rc=LCB_ERROR,
    resdata=<value optimized out>) at /usr/src/debug/libcouchbase-2.7.7/src/getconfig.cc:33
#5  0x00007ffff7d6df34 in H_config (pipeline=0x7fff9d979a00, req=0x7fff94001800, res=0x7fff9d979c30, immerr=LCB_ERROR)
    at /usr/src/debug/libcouchbase-2.7.7/src/handler.cc:872
#6  mcreq_dispatch_response (pipeline=0x7fff9d979a00, req=0x7fff94001800, res=0x7fff9d979c30, immerr=LCB_ERROR)
    at /usr/src/debug/libcouchbase-2.7.7/src/handler.cc:980
#7  0x00007ffff7d91497 in lcb::RetryQueue::fail (this=0x9139f0, op=0x7fff94000930, err=LCB_ERROR)
    at /usr/src/debug/libcouchbase-2.7.7/src/retryq.cc:164
#8  0x00007ffff7d9155e in lcb::RetryQueue::~RetryQueue (this=0x9139f0, __in_chrg=<value optimized out>)
    at /usr/src/debug/libcouchbase-2.7.7/src/retryq.cc:403
#9  0x00007ffff7d5cbb2 in lcb_destroy (instance=0x912a60) at /usr/src/debug/libcouchbase-2.7.7/src/instance.cc:557
#10 0x00000000004b5662 in con_thr (arg=0x909ee8) at src/store_couchbase.c:3658
#11 0x0000003a3ea07aa1 in start_thread () from /lib64/libpthread.so.0
#12 0x0000003a3e2e8aad in clone () from /lib64/libc.so.6

(gdb) p pipeline
$2 = (mc_PIPELINE *) 0xab2170
(gdb) p pipeline->parent
$3 = (struct mc_cmdqueue_st *) 0x0
|
Apparently a bucket config request is still pending in the retry queue and is retried while the instance is being destroyed; at that point pipeline->parent is already NULL (see the gdb output above), which crashes mcreq_allocate_packet(). I would expect that no requests are (re)tried while the connection is being closed; they should be discarded instead.
I'm not sure whether this is triggered by the (new) lcb_ping3(), or whether it is just luck that I'm running into this now while trying the 2.7.7 version.