Couchbase Server / MB-3461

memcached crashes when vbucketmigrator reuses tap stream name

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker
    • Fix Version/s: 1.7.0
    • Affects Version/s: 1.6.5.2
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels: None
    • Environment: Ubuntu 10.04 64bit

    Description

      Install Membase Server 1.6.5.2r-5 on Ubuntu 10.04 64-bit (node 52).
      Load 6M keys of size 2K.
      Install Membase Server 1.6.5.2r-5 on nodes 53 and 55.
      Add node 53 to node 52.
      Rebalance.
      When the rebalance finishes, kill the vbucketmigrator process several times.
      Replication works as expected.
      Then add node 55 to the cluster.
      Rebalance.
      When the rebalance finishes, kill the vbucketmigrator process many times.
      The Membase server on node 52 then crashes (it ran out of disk space).
      Free some disk space on node 52.
      The Membase server starts again.
      Kill vbucketmigrator several more times.
      Note that the 4 database files grow in size each time vbucketmigrator is killed.
      I see the error below when I run stats all on node 52:
      http://screencast.com/t/wHYwOmdx

      From alk:

      Here are the backtraces from the core dump:

      (gdb) thread apply all bt

      Thread 12 (Thread 93):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x556c36f6 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
      #2 0x555a5c97 in ?? () from /usr/lib/libevent-1.4.so.2
      #3 0x55598c5a in event_base_loop () from /usr/lib/libevent-1.4.so.2
      #4 0x0805df2c in worker_libevent ()
      #5 0x555e3955 in start_thread (arg=0x5659ab70) at pthread_create.c:300
      #6 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 11 (Thread 90):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x556c36f6 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
      #2 0x555a5c97 in ?? () from /usr/lib/libevent-1.4.so.2
      #3 0x55598c5a in event_base_loop () from /usr/lib/libevent-1.4.so.2
      #4 0x0805df2c in worker_libevent ()
      #5 0x555e3955 in start_thread (arg=0x55f97b70) at pthread_create.c:300
      #6 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 10 (Thread 115):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x555e8482 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
      at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_timedwait.S:179
      #2 0x5660051c in SyncObject::wait (this=0xafad9c8, tv=...) at syncobject.hh:42
      #3 0x565ffa90 in IdleTask::run (this=0xafc9d38, d=..., t=...) at dispatcher.cc:220
      #4 0x565fede1 in Dispatcher::run (this=0xafad9c0) at dispatcher.cc:116
      #5 0x565fe658 in launch_dispatcher_thread (arg=0xafad9c0) at dispatcher.cc:26
      #6 0x555e3955 in start_thread (arg=0x56e47b70) at pthread_create.c:300
      #7 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 9 (Thread 114):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x555e8482 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
      at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_timedwait.S:179
      #2 0x5660051c in SyncObject::wait (this=0xa2bcd70, tv=...) at syncobject.hh:42
      #3 0x565ffa90 in IdleTask::run (this=0xa2bce70, d=..., t=...) at dispatcher.cc:220
      #4 0x565fede1 in Dispatcher::run (this=0xa2bcd68) at dispatcher.cc:116
      #5 0x565fe658 in launch_dispatcher_thread (arg=0xa2bcd68) at dispatcher.cc:26
      #6 0x555e3955 in start_thread (arg=0x56c46b70) at pthread_create.c:300
      #7 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 8 (Thread 113):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x555e7f7f in __pthread_cond_wait (cond=0xafc6270, mutex=0xafc6254) at pthread_cond_wait.c:153
      #2 0x566003ce in SyncObject::wait (this=0xafc6250) at syncobject.hh:31
      #3 0x565feb24 in Dispatcher::run (this=0xafc6248) at dispatcher.cc:82
      #4 0x565fe658 in launch_dispatcher_thread (arg=0xafc6248) at dispatcher.cc:26
      #5 0x555e3955 in start_thread (arg=0x56a45b70) at pthread_create.c:300
      #6 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 7 (Thread 92):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x556c36f6 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
      #2 0x555a5c97 in ?? () from /usr/lib/libevent-1.4.so.2
      #3 0x55598c5a in event_base_loop () from /usr/lib/libevent-1.4.so.2
      #4 0x0805df2c in worker_libevent ()
      #5 0x555e3955 in start_thread (arg=0x56399b70) at pthread_create.c:300
      #6 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 6 (Thread 91):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x556c36f6 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
      #2 0x555a5c97 in ?? () from /usr/lib/libevent-1.4.so.2
      #3 0x55598c5a in event_base_loop () from /usr/lib/libevent-1.4.so.2
      #4 0x0805df2c in worker_libevent ()
      #5 0x555e3955 in start_thread (arg=0x56198b70) at pthread_create.c:300
      #6 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 5 (Thread 89):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x556c36f6 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
      #2 0x555a5c97 in ?? () from /usr/lib/libevent-1.4.so.2
      #3 0x55598c5a in event_base_loop () from /usr/lib/libevent-1.4.so.2
      #4 0x0805df2c in worker_libevent ()
      #5 0x555e3955 in start_thread (arg=0x55d96b70) at pthread_create.c:300
      #6 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 4 (Thread 88):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x5568e516 in nanosleep () at ../sysdeps/unix/syscall-template.S:82
      #2 0x5568e340 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138
      #3 0x08061595 in check_isasl_db_thread ()
      #4 0x555e3955 in start_thread (arg=0x55b95b70) at pthread_create.c:300
      #5 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 3 (Thread 87):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x556b3f5b in read () at ../sysdeps/unix/syscall-template.S:82
      #2 0x5565e11b in _IO_new_file_underflow (fp=0x55739420) at fileops.c:606
      #3 0x5565f9bb in _IO_default_uflow (fp=0x55739420) at genops.c:440
      #4 0x55660de8 in __uflow (fp=0x55739420) at genops.c:394
      #5 0x556566dc in _IO_getc (fp=0x55739420) at getc.c:41
      #6 0x555756ed in check_stdin_thread () from ./bin/memcached/stdin_term_handler.so
      #7 0x555e3955 in start_thread (arg=0x5596ab70) at pthread_create.c:300
      #8 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

      Thread 2 (Thread 83):
      #0 0x55572430 in __kernel_vsyscall ()
      #1 0x556c36f6 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82
      #2 0x555a5c97 in ?? () from /usr/lib/libevent-1.4.so.2
      #3 0x55598c5a in event_base_loop () from /usr/lib/libevent-1.4.so.2
      #4 0x0805d453 in main ()

      Thread 1 (Thread 116):
      #0 0x0805eeab in notify_io_complete ()
      #1 0x56660298 in std::pointer_to_binary_function<void const*, ENGINE_ERROR_CODE, void>::operator()(const void *, <anonymous enum>) const (this=0x5704828c, __x=0x4, __y=ENGINE_SUCCESS) at /usr/include/c++/4.4/bits/stl_function.h:457
      #2 0x5665f774 in std::binder2nd<std::pointer_to_binary_function<void const*, ENGINE_ERROR_CODE, void> >::operator() (
      this=0x5704828c, __x=@0x57f5c238) at /usr/include/c++/4.4/backward/binders.h:153
      #3 0x5665e685 in std::for_each<std::_List_iterator<void const*>, std::binder2nd<std::pointer_to_binary_function<void const*, ENGINE_ERROR_CODE, void> > > (__first=..., __last=..., __f=...) at /usr/include/c++/4.4/bits/stl_algo.h:4200
      #4 0x5665da63 in EventuallyPersistentEngine::notifyIOComplete<std::list<void const*, std::allocator<void const*> > >(std::list<void const*, std::allocator<void const*> >, <anonymous enum>) (this=0x9489d48, cookies=..., status=ENGINE_SUCCESS) at ep_engine.h:333
      #5 0x5665c65f in TapConnMap::notifyIOThreadMain (this=0x948a024, engine=0x9489d48) at tapconnmap.cc:295
      #6 0x56628cb4 in EventuallyPersistentEngine::notifyTapIoThread (this=0x9489d48) at ep_engine.cc:2820
      #7 0x56621bae in EvpNotifyTapIo (arg=0x9489d48) at ep_engine.cc:731
      #8 0x555e3955 in start_thread (arg=0x57048b70) at pthread_create.c:300
      #9 0x556c2e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
      (gdb)

      SIGSEGV was delivered to Thread 1
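
Frames #1 to #4 of Thread 1 show a list of cookies (std::list<void const*>) being walked with std::for_each, with a notification callback invoked on each entry. In frame #1 the cookie argument is 0x4, which is unlikely to be a valid connection pointer. One plausible reading, not confirmed anywhere in this ticket, is that the list still held a cookie for a connection that had already gone away. The sketch below is only an illustration of guarding such a loop: LiveCookies, notifyLiveCookies, and EngineStatus are hypothetical names, not part of ep-engine or memcached, and a lambda stands in for the bind2nd/ptr_fun adapters seen in the trace.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <set>

// Hypothetical stand-in for the engine status enum; only a success value
// appears in the backtrace.
enum EngineStatus { kSuccess = 0 };

// Hypothetical registry of cookies whose connections are still open.
class LiveCookies {
public:
    void add(const void* cookie) { live_.insert(cookie); }
    void remove(const void* cookie) { live_.erase(cookie); }
    bool contains(const void* cookie) const { return live_.count(cookie) != 0; }

private:
    std::set<const void*> live_;
};

// Walk the cookie list and call notify(cookie, status) only for cookies that
// are still registered as live. Returns how many entries were skipped.
template <typename Notify>
std::size_t notifyLiveCookies(const std::list<const void*>& cookies,
                              const LiveCookies& live,
                              EngineStatus status,
                              Notify notify) {
    std::size_t skipped = 0;
    std::for_each(cookies.begin(), cookies.end(), [&](const void* cookie) {
        if (live.contains(cookie)) {
            notify(cookie, status);
        } else {
            ++skipped;  // stale entry: the connection is gone, do not touch it
        }
    });
    return skipped;
}
```

A caller would register a cookie when a connection opens, remove it when the connection closes, and pass the registry to notifyLiveCookies together with the pending list. Whether the real crash has this cause cannot be determined from the backtrace alone.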

          People

            dustin Dustin Sallings (Inactive)
            thuan Thuan Nguyen