Couchbase Server / MB-58135

Reading a corrupted access.log file during warmup raises an unhandled exception, which can cause the Data Service to restart

Details

    • Untriaged
    • 0
    • Unknown
    • KV 2023-4

    Description

      Summary

      If an access.log file becomes corrupted (potentially because the filesystem ran out of space while the file was being written) and the bucket requires warmup before the access.log is successfully rewritten, warmup detects the corruption, but in at least one code path the resulting exception is not caught. The uncaught exception terminates the memcached process.
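
To illustrate the failure mode described above, here is a minimal sketch of catching the exception at the load boundary so that a corrupt file is reported as a status instead of escaping the task. It is not the kv_engine implementation or the eventual fix; `AccessLogResult` and `tryLoadAccessLog` are hypothetical names, and the loader is passed in as an arbitrary callable.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical status type; not part of kv_engine.
enum class AccessLogResult { Loaded, Corrupt };

// Hypothetical wrapper: runs the supplied loader and converts any
// std::exception-derived error into a status, so that the exception
// cannot propagate out of the calling task and reach std::terminate.
AccessLogResult tryLoadAccessLog(const std::function<void()>& loadFn,
                                 std::string& error) {
    assert(loadFn && "loadFn must be callable");
    try {
        loadFn();
        return AccessLogResult::Loaded;
    } catch (const std::exception& e) {
        // Record the reason; the caller decides how to proceed
        // (for example, by continuing warmup without the access log).
        error = e.what();
        return AccessLogResult::Corrupt;
    }
}
```

A caller would check the returned status and, on `Corrupt`, log `error` and carry on rather than letting the process abort.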

      Details

      The following was logged after the disk became full:

      2023-08-01T07:06:03.651434-04:00 CRITICAL *** Fatal error encountered during exception handling ***
      2023-08-01T07:06:03.651512-04:00 CRITICAL Caught unhandled std::exception-derived exception. what(): MutationLogEntryV3::newEntry: magic (which is 50) is not equal to 71
      

      Backtrace:

      (gdb) bt
      #0  0x00007f73ec7c0387 in raise () from /lib64/libc.so.6
      #1  0x00007f73ec7c1a78 in abort () from /lib64/libc.so.6
      #2  0x00007f73ed10b63c in __gnu_cxx::__verbose_terminate_handler () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/vterminate.cc:95
      #3  0x0000000000b3259b in backtrace_terminate_handler () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/utilities/terminate_handler.cc:88
      #4  0x00007f73ed1168f6 in __cxxabiv1::__terminate (handler=<optimized out>) at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:48
      #5  0x00007f73ed116961 in std::terminate () at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_terminate.cc:58
      #6  0x00007f73ed116bf4 in __cxxabiv1::__cxa_throw (obj=obj@entry=0x7f72f0000df0, tinfo=0x107ef28 <typeinfo for std::invalid_argument>, dest=0x443d90 <std::invalid_argument::~invalid_argument()@plt>) at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/libsupc++/eh_throw.cc:95
      #7  0x0000000000835458 in MutationLogEntryV3::newEntry (itr=..., buflen=140131897888400) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/mutation_log_entry.h:485
      #8  0x0000000000832deb in MutationLog::iterator::prepItem (this=0x7f72fffed030) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/mutation_log.cc:902
      #9  0x00000000008337fb in MutationLog::iterator::operator++ (this=0x7f72fffed030) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/mutation_log.cc:764
      #10 MutationLogHarvester::loadBatch (this=this@entry=0x7f72fffed080, start=..., limit=10000) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/mutation_log.cc:1026
      #11 0x000000000073c7ca in Warmup::doWarmup (this=0x7f72bc23f000, lf=..., vbmap=..., cb=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/warmup.cc:1804
      #12 0x0000000000743c6c in Warmup::loadingAccessLog (this=0x7f72bc23f000, shardId=<optimized out>) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/warmup.cc:1735
      #13 0x000000000074ca11 in WarmupLoadAccessLog::run (this=0x7f727c0e0150) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/warmup.cc:674
      #14 0x0000000000aa1029 in GlobalTask::execute (this=0x7f727c0e0150, threadName=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/globaltask.cc:98
      #15 0x0000000000a9a6ea in FollyExecutorPool::TaskProxy::scheduleViaCPUPool()::{lambda()#2}::operator()() const (__closure=0x7f72fffed650) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:309
      #16 0x0000000000aa23ce in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f72fffed650) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/cancellable_cpu_executor.cc:42
      #17 operator() (__closure=<optimized out>) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/cancellable_cpu_executor.cc:42
      #18 0x0000000000bf9ad0 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f72fffed840) at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/Function.h:416
      #19 folly::ThreadPoolExecutor::runTask (this=this@entry=0x7f73ea3bdd00, thread=..., task=...) at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/ThreadPoolExecutor.cpp:97
      #20 0x0000000000be456a in folly::CPUThreadPoolExecutor::threadRun (this=0x7f73ea3bdd00, thread=...) at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/executors/CPUThreadPoolExecutor.cpp:265
      #21 0x0000000000bfca89 in std::__invoke_impl<void, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__t=<optimized out>, __f=<optimized out>) at /usr/local/include/c++/7.3.0/bits/invoke.h:73
      #22 std::__invoke<void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__fn=<optimized out>) at /usr/local/include/c++/7.3.0/bits/invoke.h:95
      #23 std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) (__args=..., this=<optimized out>) at /usr/local/include/c++/7.3.0/functional:467
      #24 std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)>::operator()<, void>() (this=<optimized out>) at /usr/local/include/c++/7.3.0/functional:551
      #25 folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) (p=...)
          at /home/couchbase/jenkins/workspace/cbdeps-platform-build-old/deps/packages/build/folly/folly-prefix/src/folly/folly/Function.h:401
      #26 0x0000000000a9a3e4 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f73ea3bc440) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:49
      #27 CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}::operator()() (__closure=0x7f73ea3bc440) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/executor/folly_executorpool.cc:49
      #28 folly::detail::function::FunctionTraits<void ()>::callBig<CBRegisteredThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(folly::detail::function::Data&) (p=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/server_build/tlm/deps/folly.exploded/include/folly/Function.h:401
      #29 0x00007f73ed13fd40 in std::execute_native_thread_routine (__p=0x7f73ea3a4d50) at /tmp/deploy/objdir/../gcc-10.2.0/libstdc++-v3/src/c++11/thread.cc:80
      #30 0x00007f73eef47ea5 in start_thread () from /lib64/libpthread.so.0
      

      That is, while loading the access log during warmup, an entry was encountered with an invalid "magic" field (the marker written to indicate the type of access log entry): the value read was 50 where 71 was expected.
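
The kind of check that produces such a message can be sketched as follows. This is an illustration only, not the code in `mutation_log_entry.h`: `checkEntryMagic` and `kExpectedMagic` are hypothetical names, and the only detail taken from the report is that a mismatch raises `std::invalid_argument` with the expected value 71.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Assumption: 71 is the expected marker, taken from the log message above.
constexpr std::uint8_t kExpectedMagic = 71;

// Hypothetical validator (not the kv_engine code): inspects the first byte
// of a serialized entry and throws std::invalid_argument on a mismatch.
void checkEntryMagic(const std::uint8_t* buf, std::size_t buflen) {
    assert(buf != nullptr);
    if (buflen == 0) {
        throw std::invalid_argument("checkEntryMagic: empty buffer");
    }
    const int magic = buf[0];
    if (magic != kExpectedMagic) {
        throw std::invalid_argument(
                "checkEntryMagic: magic (which is " + std::to_string(magic) +
                ") is not equal to " + std::to_string(int{kExpectedMagic}));
    }
}
```

Because the check signals corruption by throwing, every caller that iterates over entries needs a handler somewhere on its call path; the backtrace above shows a path from `WarmupLoadAccessLog::run` down to `newEntry` on which the exception reached `std::terminate`.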

          People

            ashwin.govindarajulu Ashwin Govindarajulu
            drigby Dave Rigby (Inactive)