  1. Couchbase Server
  2. MB-4779

moxi on Debian Squeeze 6.0.4 64-bit crashes with segmentation fault when run as standalone server against 3 node cluster

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.0
    • Fix Version/s: 1.8.1
    • Component/s: moxi
    • Security Level: Public
    • Labels:
      None
    • Environment:
      cat /etc/issue
      Debian GNU/Linux 6.0 \n \l

      cat /etc/debian_version
      6.0.4

      Squeeze.

      Description

      >
      > (snipped)
      >
      >More info on the server (all three are identical)
      >
      >Linux membase01 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64
      >GNU/Linux
      >
      >Architecture: x86_64
      >CPU op-mode(s): 32-bit, 64-bit
      >CPU(s): 8
      >Thread(s) per core: 1
      >Core(s) per socket: 4
      >CPU socket(s): 2
      >NUMA node(s): 2
      >Vendor ID: GenuineIntel
      >CPU family: 6
      >Model: 44
      >Stepping: 2
      >CPU MHz: 2133.304
      >Virtualization: VT-x
      >L1d cache: 32K
      >L1i cache: 32K
      >L2 cache: 256K
      >L3 cache: 8192K
      >
      >             total       used       free     shared    buffers     cached
      >Mem:      70239000   33236456   37002544          0     161252   15946920
      >-/+ buffers/cache:   17128284   53110716
      >Swap:      5846008          0    5846008
      >
      >eth0 Link encap:Ethernet HWaddr 00:26:55:ec:5e:11
      > inet addr:172.17.18.51 Bcast:172.17.19.255 Mask:255.255.252.0
      > inet6 addr: fe80::226:55ff:feec:5e11/64 Scope:Link
      > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
      > RX packets:4677086580 errors:0 dropped:0 overruns:0 frame:0
      > TX packets:3707971052 errors:0 dropped:0 overruns:0 carrier:0
      > collisions:0 txqueuelen:1000
      > RX bytes:2427184337913 (2.2 TiB) TX bytes:1814771064616 (1.6 TiB)
      > Interrupt:34 Memory:fbfa0000-fbfc0000
      >
      >/opt/couchbase/bin/cbstats localhost:11210 all
      > accepting_conns: 1
      > auth_cmds: 356
      > auth_errors: 0
      > bucket_active_conns: 1
      > bucket_conns: 32
      > bytes_read: 1031538977211
      > bytes_written: 22886757597
      > cas_badval: 0
      > cas_hits: 0
      > cas_misses: 0
      > cmd_flush: 0
      > cmd_get: 1757
      > cmd_set: 343892374
      > conn_yields: 4046021
      > connection_structures: 45
      > curr_connections: 40
      > curr_items: 12758375
      > curr_items_tot: 25509661
      > daemon_connections: 10
      > decr_hits: 0
      > decr_misses: 0
      > delete_hits: 1757
      > delete_misses: 0
      > ep_bg_fetched: 0
      > ep_commit_num: 246390
      > ep_commit_time: 2
      > ep_commit_time_total: 428308
      > ep_data_age: 577
      > ep_data_age_highwat: 2148
      > ep_db_cleaner_status: complete
      > ep_db_strategy: multiMTVBDB
      > ep_dbinit: 0
      > ep_dbname: /opt/couchbase/var/lib/couchbase/data/default-data/default
      > ep_dbshards: 4
      > ep_diskqueue_drain: 303580718
      > ep_diskqueue_fill: 302265045
      > ep_diskqueue_items: 311173
      > ep_diskqueue_memory: 24893840
      > ep_diskqueue_pending: 568614052
      > ep_exp_pager_stime: 3600
      > ep_expired: 37140925
      > ep_flush_all: false
      > ep_flush_duration: 575
      > ep_flush_duration_highwat: 1402
      > ep_flush_duration_total: 433538
      > ep_flush_preempts: 0
      > ep_flusher_state: running
      > ep_flusher_todo: 66150
      > ep_inconsistent_slave_chk: 0
      > ep_io_num_read: 684
      > ep_io_num_write: 206511175
      > ep_io_read_bytes: 0
      > ep_io_write_bytes: 188356635638
      > ep_item_begin_failed: 0
      > ep_item_commit_failed: 0
      > ep_item_flush_expired: 37336561
      > ep_item_flush_failed: 0
      > ep_items_rm_from_checkpoints: 209669074
      > ep_keep_closed_checkpoints: 0
      > ep_kv_size: 14294968796
      > ep_latency_arith_cmd: 0
      > ep_latency_get_cmd: 1757
      > ep_latency_store_cmd: 343892374
      > ep_max_data_size: 67108864000
      > ep_max_txn_size: 1000
      > ep_mem_high_wat: 50331648000
      > ep_mem_low_wat: 40265318400
      > ep_min_data_age: 0
      > ep_num_active_non_resident: 0
      > ep_num_checkpoint_remover_runs: 103648
      > ep_num_eject_failures: 0
      > ep_num_eject_replicas: 0
      > ep_num_expiry_pager_runs: 143
      > ep_num_non_resident: 0
      > ep_num_not_my_vbuckets: 0
      > ep_num_pager_runs: 0
      > ep_num_value_ejects: 0
      > ep_onlineupdate: false
      > ep_onlineupdate_revert_add: 0
      > ep_onlineupdate_revert_delete: 0
      > ep_onlineupdate_revert_update: 0
      > ep_oom_errors: 0
      > ep_overhead: 209074728
      > ep_pending_ops: 0
      > ep_pending_ops_max: 0
      > ep_pending_ops_max_duration: 0
      > ep_pending_ops_total: 0
      > ep_queue_age_cap: 900
      > ep_queue_size: 295580
      > ep_storage_age: 583
      > ep_storage_age_highwat: 2504
      > ep_storage_type: featured
      > ep_store_max_concurrency: 10
      > ep_store_max_readers: 9
      > ep_store_max_readwrite: 1
      > ep_tap_bg_fetch_requeued: 0
      > ep_tap_bg_fetched: 0
      > ep_tap_keepalive: 300
      > ep_tmp_oom_errors: 0
      > ep_too_old: 4072527
      > ep_too_young: 0
      > ep_total_cache_size: 26329299003
      > ep_total_del_items: 37105176
      > ep_total_enqueued: 302265045
      > ep_total_new_items: 62519579
      > ep_total_persisted: 243616351
      > ep_uncommitted_items: 983
      > ep_value_size: 11464506014
      > ep_vb_total: 684
      > ep_vbucket_del: 0
      > ep_vbucket_del_fail: 0
      > ep_version: 1.8.0r_78_g3539559
      > ep_warmed_up: 0
      > ep_warmup: true
      > ep_warmup_dups: 0
      > ep_warmup_oom: 0
      > ep_warmup_thread: complete
      > ep_warmup_time: 31940
      > get_hits: 1757
      > get_misses: 0
      > incr_hits: 0
      > incr_misses: 0
      > libevent: 2.0.11-stable
      > limit_maxbytes: 67108864
      > listen_disabled_num: 0
      > mem_used: 14504043524
      > pid: 1450
      > pointer_size: 64
      > rejected_conns: 0
      > rusage_system: 32743.014310
      > rusage_user: 46729.776428
      > tap_checkpoint_end_received: 248292
      > tap_checkpoint_end_sent: 248134
      > tap_checkpoint_start_received: 248634
      > tap_checkpoint_start_sent: 248476
      > tap_connect_received: 17
      > tap_delete_received: 18789732
      > tap_delete_sent: 18771682
      > tap_mutation_received: 302338252
      > tap_mutation_sent: 303796706
      > tap_opaque_received: 25
      > tap_opaque_sent: 34
      > threads: 4
      > time: 1328656452
      > total_connections: 374
      > uptime: 518292
      > vb_active_curr_items: 12758375
      > vb_active_eject: 0
      > vb_active_ht_memory: 67376736
      > vb_active_itm_memory: 6964074745
      > vb_active_num: 342
      > vb_active_num_non_resident: 0
      > vb_active_ops_create: 31303843
      > vb_active_ops_delete: 18582985
      > vb_active_ops_reject: 0
      > vb_active_ops_update: 76091716
      > vb_active_perc_mem_resident: 100
      > vb_active_queue_age: 66481591000
      > vb_active_queue_drain: 150533739
      > vb_active_queue_fill: 150703438
      > vb_active_queue_memory: 13575920
      > vb_active_queue_pending: 384162544
      > vb_active_queue_size: 169699
      > vb_dead_num: 0
      > vb_pending_curr_items: 0
      > vb_pending_eject: 0
      > vb_pending_ht_memory: 0
      > vb_pending_itm_memory: 0
      > vb_pending_num: 0
      > vb_pending_num_non_resident: 0
      > vb_pending_ops_create: 0
      > vb_pending_ops_delete: 0
      > vb_pending_ops_reject: 0
      > vb_pending_ops_update: 0
      > vb_pending_perc_mem_resident: 0
      > vb_pending_queue_age: 0
      > vb_pending_queue_drain: 0
      > vb_pending_queue_fill: 0
      > vb_pending_queue_memory: 0
      > vb_pending_queue_pending: 0
      > vb_pending_queue_size: 0
      > vb_replica_curr_items: 12751286
      > vb_replica_eject: 0
      > vb_replica_ht_memory: 67376736
      > vb_replica_itm_memory: 6946620426
      > vb_replica_num: 342
      > vb_replica_num_non_resident: 0
      > vb_replica_ops_create: 31215736
      > vb_replica_ops_delete: 18522191
      > vb_replica_ops_reject: 0
      > vb_replica_ops_update: 67899880
      > vb_replica_perc_mem_resident: 100
      > vb_replica_queue_age: 18446743612500506616
      > vb_replica_queue_drain: 153046979
      > vb_replica_queue_fill: 151561607
      > vb_replica_queue_memory: 11317920
      > vb_replica_queue_pending: 184451508
      > vb_replica_queue_size: 141474
      > version: UNKNOWN
      >

      1. core.bz2
        2.67 MB
        Matt Ingenthron
      2. moxi-log-snippet.log.gz
        50 kB
        Farshid Ghods

        Activity

        steve Steve Yen added a comment - - edited

        >>Here is the stack trace. Let me know if you need anything else and
        >>thanks for responding!
        >>
        >>GNU gdb (GDB) 7.0.1-debian
        >>Copyright (C) 2009 Free Software Foundation, Inc.
        >>License GPLv3+: GNU GPL version 3 or later
        >><http://gnu.org/licenses/gpl.html>
        >>This is free software: you are free to change and redistribute it.
        >>There is NO WARRANTY, to the extent permitted by law. Type "show
        >>copying"
        >>and "show warranty" for details.
        >>This GDB was configured as "x86_64-linux-gnu".
        >>For bug reporting instructions, please see:
        >><http://www.gnu.org/software/gdb/bugs/>...
        >>Reading symbols from /opt/couchbase/bin/moxi...done.
        >>[New Thread 17504]
        >>[New Thread 17508]
        >>[New Thread 17503]
        >>[New Thread 17506]
        >>[New Thread 17509]
        >>[New Thread 17507]
        >>[New Thread 17510]
        >>
        >>warning: Can't read pathname for load map: Input/output error.
        >>Reading symbols from /usr/lib/libz.so.1...(no debugging symbols
        >>found)...done.
        >>Loaded symbols for /usr/lib/libz.so.1
        >>Reading symbols from /lib/librt.so.1...(no debugging symbols
        >>found)...done.
        >>Loaded symbols for /lib/librt.so.1
        >>Reading symbols from /lib/libm.so.6...(no debugging symbols
        >>found)...done.
        >>Loaded symbols for /lib/libm.so.6
        >>Reading symbols from /lib/libpthread.so.0...(no debugging symbols
        >>found)...done.
        >>Loaded symbols for /lib/libpthread.so.0
        >>Reading symbols from /lib/libc.so.6...(no debugging symbols
        >>found)...done.
        >>Loaded symbols for /lib/libc.so.6
        >>Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
        >>found)...done.
        >>Loaded symbols for /lib64/ld-linux-x86-64.so.2
        >>Core was generated by `/opt/couchbase/bin/moxi -Z
        >>port_listen=11211,default_bucket_name=default,downst'.
        >>Program terminated with signal 11, Segmentation fault.
        >>#0 0x0000000000000000 in ?? ()
        >>(gdb) thread apply all bt full
        >>
        >>Thread 8 (Thread 17510):
        >>#0 0x00007f379f53058d in read () from /lib/libc.so.6
        >>No symbol table info available.
        >>#1 0x00007f379f4daeb8 in _IO_file_underflow () from /lib/libc.so.6
        >>No symbol table info available.
        >>#2 0x00007f379f4dc59e in _IO_default_uflow () from /lib/libc.so.6
        >>No symbol table info available.
        >>#3 0x00007f379f4d3a7b in getc () from /lib/libc.so.6
        >>No symbol table info available.
        >>#4 0x0000000000430fc4 in check_stdin_thread (arg=<value optimized out>)
        >>at stdin_check.c:18
        >> ch = -512
        >>#5 0x00007f379f7d58ca in start_thread () from /lib/libpthread.so.0
        >>No symbol table info available.
        >>#6 0x00007f379f53c86d in clone () from /lib/libc.so.6
        >>No symbol table info available.
        >>#7 0x0000000000000000 in ?? ()
        >>No symbol table info available.
        >>
        >>Thread 7 (Thread 17507):
        >>#0 0x00007f379f53ce63 in epoll_wait () from /lib/libc.so.6
        >>No symbol table info available.
        >>#1 0x000000000044a972 in epoll_dispatch (base=0x1e092d0, tv=<value
        >>optimized out>) at epoll.c:404
        >> epollop = 0x1e09550
        >> events = 0x216f550
        >> i = <value optimized out>
        >> res = <value optimized out>
        >> timeout = <value optimized out>
        >> __func__ = "epoll_dispatch"
        >>#2 0x0000000000444bef in event_base_loop (base=0x1e092d0, flags=<value
        >>optimized out>) at event.c:1558
        >> evsel = 0x49f200
        >> tv = {tv_sec = 0, tv_usec = 40644}
        >> tv_p = 0x7f379dc68e80
        >> res = <value optimized out>
        >> retval = <value optimized out>
        >> __func__ = "event_base_loop"
        >>#3 0x0000000000414b15 in worker_libevent (arg=<value optimized out>) at
        >>thread.c:272
        >> me = 0x1df5250
        >>#4 0x00007f379f7d58ca in start_thread () from /lib/libpthread.so.0
        >>No symbol table info available.
        >>#5 0x00007f379f53c86d in clone () from /lib/libc.so.6
        >>No symbol table info available.
        >>#6 0x0000000000000000 in ?? ()
        >>
        >>No symbol table info available.
        >>
        >>Thread 6 (Thread 17509):
        >>#0 0x00007f379f531913 in poll () from /lib/libc.so.6
        >>No symbol table info available.
        >>#1 0x0000000000452ed6 in Curl_socket_ready ()
        >>No symbol table info available.
        >>#2 0x0000000000469f49 in Curl_do_perform ()
        >>No symbol table info available.
        >>#3 0x000000000044c8b1 in run_rest_conflate (arg=0x1e15670) at rest.c:293
        >> urls = 0x1e0e2b0
        >>"http://127.0.0.1:8091/pools/default/saslBucketsStreaming"
        >> next = 0x0
        >> userpass = <value optimized out>
        >> curl_error_string = '\000' <repeats 255 times>
        >> conf = <value optimized out>
        >> c = <value optimized out>
        >> __PRETTY_FUNCTION__ = "run_rest_conflate"
        >> curl_handle = <value optimized out>
        >>#4 0x00007f379f7d58ca in start_thread () from /lib/libpthread.so.0
        >>No symbol table info available.
        >>#5 0x00007f379f53c86d in clone () from /lib/libc.so.6
        >>No symbol table info available.
        >>#6 0x0000000000000000 in ?? ()
        >>No symbol table info available.
        >>
        >>Thread 5 (Thread 17506):
        >>#0 0x00007f379f53ce63 in epoll_wait () from /lib/libc.so.6
        >>No symbol table info available.
        >>#1 0x000000000044a972 in epoll_dispatch (base=0x1e04be0, tv=<value
        >>optimized out>) at epoll.c:404
        >> epollop = 0x1e04e60
        >> events = 0x1e04e80
        >> i = <value optimized out>
        >> res = <value optimized out>
        >> timeout = <value optimized out>
        >> __func__ = "epoll_dispatch"
        >>#2 0x0000000000444bef in event_base_loop (base=0x1e04be0, flags=<value
        >>optimized out>) at event.c:1558
        >> evsel = 0x49f200
        >> tv = {tv_sec = 0, tv_usec = 130286}
        >> tv_p = 0x7f379e469e80
        >> res = <value optimized out>
        >> retval = <value optimized out>
        >> __func__ = "event_base_loop"
        >>#3 0x0000000000414b15 in worker_libevent (arg=<value optimized out>) at
        >>thread.c:272
        >>
        >> me = 0x1df2520
        >>#4 0x00007f379f7d58ca in start_thread () from /lib/libpthread.so.0
        >>No symbol table info available.
        >>#5 0x00007f379f53c86d in clone () from /lib/libc.so.6
        >>No symbol table info available.
        >>#6 0x0000000000000000 in ?? ()
        >>No symbol table info available.
        >>
        >>Thread 4 (Thread 17503):
        >>#0 0x00007f379f53ce63 in epoll_wait () from /lib/libc.so.6
        >>No symbol table info available.
        >>#1 0x000000000044a972 in epoll_dispatch (base=0x1de91c0, tv=<value
        >>optimized out>) at epoll.c:404
        >> epollop = 0x1de9440
        >> events = 0x1de9460
        >> i = <value optimized out>
        >> res = <value optimized out>
        >> timeout = <value optimized out>
        >> __func__ = "epoll_dispatch"
        >>#2 0x0000000000444bef in event_base_loop (base=0x1de91c0, flags=<value
        >>optimized out>) at event.c:1558
        >> evsel = 0x49f200
        >> tv = {tv_sec = 0, tv_usec = 127827}
        >> tv_p = 0x7fffd745a910
        >> res = <value optimized out>
        >> retval = <value optimized out>
        >> __func__ = "event_base_loop"
        >>#3 0x000000000040e295 in main (argc=12, argv=0x7fffd745baf8) at
        >>memcached.c:5061
        >> c = <value optimized out>
        >> lock_memory = false
        >> do_daemonize = false
        >> check_stdin = true
        >> maxcore = 0
        >> username = 0x0
        >> pid_file = 0x0
        >> cproxy_cfg = 0x1de9130 "a-certificates.crt"
        >> cproxy_behavior = 0x1de9010 "\r\n"
        >> pw = <value optimized out>
        >> log_file = 0x0
        >> log_mode = <value optimized out>
        >> rlim = {rlim_cur = 10240, rlim_max = 10240}
        >>
        >>Thread 3 (Thread 17508):
        >>#0 0x00007f379f7da1fc in pthread_cond_wait@@GLIBC_2.3.2 () from
        >>/lib/libpthread.so.0
        >>No symbol table info available.
        >>
        >>#1 0x0000000000413c57 in assoc_maintenance_thread (arg=<value optimized
        >>out>) at assoc.c:220
        >> ii = 0
        >>#2 0x00007f379f7d58ca in start_thread () from /lib/libpthread.so.0
        >>No symbol table info available.
        >>#3 0x00007f379f53c86d in clone () from /lib/libc.so.6
        >>No symbol table info available.
        >>#4 0x0000000000000000 in ?? ()
        >>No symbol table info available.
        >>
        >>Thread 2 (Thread 17504):
        >>#0 0x00007f379f53ce63 in epoll_wait () from /lib/libc.so.6
        >>No symbol table info available.
        >>#1 0x000000000044a972 in epoll_dispatch (base=0x1dfc000, tv=<value
        >>optimized out>) at epoll.c:404
        >> epollop = 0x1dfc280
        >> events = 0x1dfc2a0
        >> i = <value optimized out>
        >> res = <value optimized out>
        >> timeout = <value optimized out>
        >> __func__ = "epoll_dispatch"
        >>#2 0x0000000000444bef in event_base_loop (base=0x1dfc000, flags=<value
        >>optimized out>) at event.c:1558
        >> evsel = 0x49f200
        >> tv = {tv_sec = 0, tv_usec = 138529}
        >> tv_p = 0x7f379f46be80
        >> res = <value optimized out>
        >> retval = <value optimized out>
        >> __func__ = "event_base_loop"
        >>#3 0x0000000000414b15 in worker_libevent (arg=<value optimized out>) at
        >>thread.c:272
        >> me = 0x1decac0
        >>#4 0x00007f379f7d58ca in start_thread () from /lib/libpthread.so.0
        >>No symbol table info available.
        >>#5 0x00007f379f53c86d in clone () from /lib/libc.so.6
        >>No symbol table info available.
        >>#6 0x0000000000000000 in ?? ()
        >>No symbol table info available.
        >>
        >>Thread 1 (Thread 17505):
        >>#0 0x0000000000000000 in ?? ()
        >>No symbol table info available.
        >>#1 0x0000000000000000 in ?? ()
        >>No symbol table info available.

        ingenthr Matt Ingenthron added a comment -

        Added a core file from the crash in question.

        farshid Farshid Ghods (Inactive) added a comment -

        We need more information to reproduce this scenario.

        1- Is this during rebalancing?
        2- What's the usage pattern? Multi-get, multi-set?
        3- Is this a regression from 1.7?
        4- Standalone moxi or server side?

        farshid Farshid Ghods (Inactive) added a comment -

        please provide the exact set of params you have used to start moxi as a standalone service

        ingenthr Matt Ingenthron added a comment -

        From Dennis:

        1- Is this during rebalancing? No.
        2- What's the usage pattern? Multi-get, multi-set? Right now we are in the process of a two-week conversion from memcache, so we are doing sets to both environments but only gets to the old one. After the two weeks we will disable the old servers and do both sets and gets on the Couchbase servers. So right now we are only doing single sets.
        3- Is this a regression from 1.7? Unknown; we never ran 1.7.
        4- Standalone moxi or server side? Server-side moxi.

        This is not Ubuntu but Debian Squeeze 6.0.4. I talked to Matt and I will be changing over to Ubuntu 10.04 so we are on a supported platform. You may want to wait to work on this until I can confirm it is still happening on Ubuntu.

        I tried logging into your jira system with my forums username and password but I could not. So I am replying via email.

        ingenthr Matt Ingenthron added a comment -

        The bug reporter got back to us and said that after moving from Debian to Ubuntu, the problem seems to have gone away. Closing this as won't fix, given that Debian is not currently a supported platform.

        For posterity: there's nothing anti-Debian about this. It's just pro-testing-what-we-support, and we can't test everything.

        farshid Farshid Ghods (Inactive) added a comment -

        some more information:

        moxi: cproxy.c:1185: cproxy_release_downstream: Assertion `!found' failed.
        Aborted

        I ran a strace as well:

        27818 21:13:36.274459 epoll_ctl(37, EPOLL_CTL_ADD, 183, {EPOLLOUT, {u32=183, u64=183}}) = 0
        27818 21:13:36.274539 write(2, "moxi: cproxy.c:1185: cproxy_rele"..., 75) = 75
        27818 21:13:36.274584 rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
        27818 21:13:36.274626 tgkill(27813, 27818, SIGABRT) = 0
        27818 21:13:36.274664 --- SIGABRT (Aborted) @ 0 (0) ---
        27817 21:13:36.274775 <... ???? resumed> ) = ? <unavailable>
        27813 21:13:36.274788 <... ???? resumed> ) = ? <unavailable>
        27820 21:13:36.274826 +++ killed by SIGABRT +++
        27817 21:13:36.274835 +++ killed by SIGABRT +++
        27816 21:13:36.274840 +++ killed by SIGABRT +++
        27815 21:13:36.274846 +++ killed by SIGABRT +++
        27819 21:13:36.285302 +++ killed by SIGABRT +++
        27813 21:13:36.285332 +++ killed by SIGABRT +++
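
        The strace above has the usual shape of a failed assert() on glibc: the message is written to stderr (fd 2), SIGABRT is unblocked, and the thread signals itself, which takes down every thread in the process. The sketch below reproduces that sequence in a throwaway child process. It is an illustration only, not moxi code; the helper name and the `found` variable are made up for the example.

```c
#undef NDEBUG            /* keep assert() active even if NDEBUG was set */
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical, not part of moxi): fork a child that
 * fails an assertion and report which signal terminated it.
 * Returns the signal number, 0 if the child exited normally, -1 on error.
 */
static int signal_from_failed_assert(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        volatile int found = 1;  /* stands in for the flag named in the log */
        assert(!found);          /* writes the message to stderr, then aborts */
        _exit(0);                /* not reached while the assertion fails */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* A child killed by abort() is reported as terminated by SIGABRT. */
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

        Running the helper under `strace -f` should show a similar write(2, ...) followed by a SIGABRT delivery, though the exact system calls vary with the libc version.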

        moxi.cfg:

        steve Steve Yen added a comment -

        On the previous comment from Farshid on the assert in cproxy.c, I think this fix can help things, but not fully proven as I don't have a reproducing case...

        http://review.couchbase.org/#change,14423

        steve Steve Yen added a comment -

        The workaround until 1.8.1 is to increase moxi's downstream_conn_max setting.

        The bug has not been reproduced so far. The fix in 1.8.1 comes from tracing back the code paths that lead to the assert(): this turned up a logic issue (a registered timeout was not being cleared) that could, in theory, trigger the assert().
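
        As a rough illustration of how an uncleared timeout could end in an assertion like `!found`, here is a minimal sketch. It assumes the assertion guards against releasing the same object twice, which is a guess about the general bug class and not a description of the actual moxi code or of the 1.8.1 change. All type and function names (conn, pool, release, on_timeout) are hypothetical.

```c
#include <assert.h>   /* callers may turn a false return into assert() */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from moxi. */
typedef struct conn {
    struct conn *next;   /* link in the pool's released list */
    bool timeout_armed;  /* a timeout callback may still fire for this conn */
} conn;

typedef struct {
    conn *released;      /* head of the released (free) list */
} pool;

/* Walks the released list looking for c. */
static bool on_released_list(const pool *p, const conn *c)
{
    for (const conn *cur = p->released; cur != NULL; cur = cur->next)
        if (cur == c)
            return true;
    return false;
}

/*
 * Puts c back on the released list. Returns false where an
 * assert(!found)-style check would fire, i.e. on a double release.
 * clear_timeout models whether the release path disarms the timer.
 */
static bool release(pool *p, conn *c, bool clear_timeout)
{
    if (on_released_list(p, c))
        return false;
    if (clear_timeout)
        c->timeout_armed = false;
    c->next = p->released;
    p->released = c;
    return true;
}

/* Timer callback: releases c only if its timeout is still armed. */
static bool on_timeout(pool *p, conn *c)
{
    if (!c->timeout_armed)
        return true;     /* timer was cleared, nothing to do */
    c->timeout_armed = false;
    return release(p, c, true);
}
```

        With `release(&p, &c, false)` the timer stays armed, so a later `on_timeout(&p, &c)` attempts a second release and returns false; with `release(&p, &c, true)` the stale callback becomes a no-op.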


          People

          • Assignee:
            steve Steve Yen
            Reporter:
            steve Steve Yen