Couchbase Server / MB-49594

memcached crashed in CheckpointManager::extractItemsToExpel(std::lock_guard<std::mutex> const&) ()

Details

    Description

      Steps:

      1. Create a 4-node cluster.
      2. Create the required buckets and collections.
      3. Create 625000 items sequentially.
      4. Update 625000 RandomKey keys to create 50 percent fragmentation.
      5. Create 625000 items sequentially.
      6. Update 625000 RandomKey keys to create 50 percent fragmentation.
      7. Rebalance in with loading of docs. Abort and resume the rebalance at 20%, 40%, 60%, 80%.
      8. Crash Magma/memcached with loading of docs.
      9. Rebalance out with loading of docs. Abort and resume the rebalance at 20%, 40%, 60%, 80%.
      10. Crash Magma/memcached with loading of docs.
      11. Rebalance in/out with loading of docs. Abort and resume the rebalance at 20%, 40%, 60%, 80%.
      12. Crash Magma/memcached with loading of docs.
      13. Swap rebalance with loading of docs. Abort and resume the rebalance at 20%. The rebalance does not appear to have been aborted; it ran to 100%. The outgoing node was removed successfully. Crashes were seen on 2 nodes after the rebalance completed.

      Node 172.23.106.236

      /opt/couchbase/bin/minidump-2-core /opt/couchbase/var/lib/couchbase/crash/5ea39663-f89c-49ce-79cabeb9-2e63b255.dmp > /opt/couchbase/var/lib/couchbase/crash/5ea39663-f89c-49ce-79cabeb9-2e63b255.core
       
      gdb --batch /opt/couchbase/bin/memcached -c /opt/couchbase/var/lib/couchbase/crash/5ea39663-f89c-49ce-79cabeb9-2e63b255.core -ex "bt full" -ex quit
       
      Core was generated by `/opt/couchbase/bin/memcached -C /opt/couchbase/var/lib/couchbase/config/memcach'.
       #0  0x00007f4d31e49337 in raise () from /lib64/libc.so.6
       No symbol table info available.
       #1  0x00007f4d31e4aa28 in abort () from /lib64/libc.so.6
       No symbol table info available.
       #2  0x00007f4d3279463c in __gnu_cxx::__verbose_terminate_handler() [clone .cold] () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #3  0x0000000000afdaeb in backtrace_terminate_handler() ()
       No symbol table info available.
       #4  0x00007f4d3279f8f6 in __cxxabiv1::__terminate(void (*)()) () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #5  0x00007f4d3279f961 in std::terminate() () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #6  0x00007f4d3279fbf4 in __cxa_throw () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #7  0x0000000000534090 in void cb::throwWithTrace<std::underflow_error>(std::underflow_error const&) ()
       No symbol table info available.
       #8  0x00000000008a93e8 in cb::ThrowExceptionUnderflowPolicy<unsigned long>::underflow ()
       No symbol table info available.
       #9  0x00000000008a9cd8 in Checkpoint::MemoryCounter::operator-=(unsigned long) ()
       No symbol table info available.
       #10 0x00000000008ac0d0 in Checkpoint::expelItems(CheckpointIterator<boost::container::list<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> >, MemoryTrackingAllocator<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> > > > > const&, unsigned long) ()
       No symbol table info available.
       #11 0x00000000007c3eba in CheckpointManager::extractItemsToExpel(std::lock_guard<std::mutex> const&) ()
       No symbol table info available.
       #12 0x00000000007c440b in CheckpointManager::expelUnreferencedCheckpointItems() ()
       No symbol table info available.
       #13 0x00000000007c736c in ClosedUnrefCheckpointRemoverTask::attemptItemExpelling() ()
       No symbol table info available.
       #14 0x00000000007c83a8 in ClosedUnrefCheckpointRemoverTask::run() ()
       No symbol table info available.
       #15 0x0000000000a6e3a2 in GlobalTask::execute() ()
       No symbol table info available.
       #16 0x0000000000a6b5b5 in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const ()
       No symbol table info available.
       #17 0x0000000000bbfe60 in folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ()
       No symbol table info available.
       #18 0x0000000000ba7c1a in folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>) ()
       No symbol table info available.
       #19 0x0000000000bc2e19 in void folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) ()
       No symbol table info available.
       #20 0x0000000000a6b244 in void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) ()
       No symbol table info available.
       #21 0x00007f4d327c8d40 in execute_native_thread_routine () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #22 0x00007f4d345d0e65 in start_thread () from /lib64/libpthread.so.0
       No symbol table info available.
       #23 0x00007f4d31f1188d in clone () from /lib64/libc.so.6
       No symbol table info available.
      

      Node 172.23.121.78

      Stack Trace of first crash - 96bb6c00-0b9f-4886-7f5c71a5-c3c93d05.dmp
      Core was generated by `/opt/couchbase/bin/memcached -C /opt/couchbase/var/lib/couchbase/config/memcach'.
       #0  0x00007f933af2d337 in raise () from /lib64/libc.so.6
       No symbol table info available.
       #1  0x00007f933af2ea28 in abort () from /lib64/libc.so.6
       No symbol table info available.
       #2  0x00007f933b87863c in __gnu_cxx::__verbose_terminate_handler() [clone .cold] () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #3  0x0000000000afdaeb in backtrace_terminate_handler() ()
       No symbol table info available.
       #4  0x00007f933b8838f6 in __cxxabiv1::__terminate(void (*)()) () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #5  0x00007f933b883961 in std::terminate() () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #6  0x00007f933b883bf4 in __cxa_throw () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #7  0x0000000000534090 in void cb::throwWithTrace<std::underflow_error>(std::underflow_error const&) ()
       No symbol table info available.
       #8  0x00000000008a93e8 in cb::ThrowExceptionUnderflowPolicy<unsigned long>::underflow ()
       No symbol table info available.
       #9  0x00000000008a9cd8 in Checkpoint::MemoryCounter::operator-=(unsigned long) ()
       No symbol table info available.
       #10 0x00000000008ac0d0 in Checkpoint::expelItems(CheckpointIterator<boost::container::list<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> >, MemoryTrackingAllocator<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> > > > > const&, unsigned long) ()
       No symbol table info available.
       #11 0x00000000007c3eba in CheckpointManager::extractItemsToExpel(std::lock_guard<std::mutex> const&) ()
       No symbol table info available.
       #12 0x00000000007c440b in CheckpointManager::expelUnreferencedCheckpointItems() ()
       No symbol table info available.
       #13 0x00000000007c736c in ClosedUnrefCheckpointRemoverTask::attemptItemExpelling() ()
       No symbol table info available.
       #14 0x00000000007c83a8 in ClosedUnrefCheckpointRemoverTask::run() ()
       No symbol table info available.
       #15 0x0000000000a6e3a2 in GlobalTask::execute() ()
       No symbol table info available.
       #16 0x0000000000a6b5b5 in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const ()
       No symbol table info available.
       #17 0x0000000000bbfe60 in folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ()
       No symbol table info available.
       #18 0x0000000000ba7c1a in folly::CPUThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>) ()
       No symbol table info available.
       #19 0x0000000000bc2e19 in void folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) ()
       No symbol table info available.
       #20 0x0000000000a6b244 in void folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) ()
       No symbol table info available.
       #21 0x00007f933b8acd40 in execute_native_thread_routine () from /opt/couchbase/bin/../lib/libstdc++.so.6
       No symbol table info available.
       #22 0x00007f933d6b4e65 in start_thread () from /lib64/libpthread.so.0
       No symbol table info available.
       #23 0x00007f933aff588d in clone () from /lib64/libc.so.6
       No symbol table info available.
      

      QE Test

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/magma_temp_job1.ini -p bucket_storage=magma,bucket_eviction_policy=fullEviction,rerun=False -t aGoodDoctor.Hospital.Murphy.SystemTestMagma,nodes_init=4,graceful=True,skip_cleanup=True,num_items=625000,num_buckets=1,bucket_names=GleamBook,doc_size=1024,bucket_type=membase,eviction_policy=fullEviction,iterations=10,batch_size=1000,sdk_timeout=60,log_level=debug,infra_log_level=debug,rerun=False,skip_cleanup=True,key_size=18,randomize_doc_size=False,randomize_value=True,assert_crashes_on_load=True,num_collections=50,maxttl=10,num_indexes=50,pc=25,index_nodes=0,cbas_nodes=0,fts_nodes=0,ops_rate=30000,ramQuota=2048,doc_ops=create:update:delete:read,rebl_ops_rate=20000,key_type=RandomKey,vbuckets=128,mutation_perc=100 -m rest'
      

          People

            ritesh.agarwal Ritesh Agarwal