  1. Couchbase Server
  2. MB-59518

ActiveDurabilityMonitor::commit vb:595 failed with status: no such key


Details

    Description

      Steps to reproduce

      1. Created a 4-node KV cluster
      2. Created a couchstore bucket named 'default' and loaded 520000 items into it
      3. Started a create workload with durability=MAJORITY
      4. Failed over node ns_1@172.23.109.84
      5. Started a rebalance

      Rebalance fails with the following reason:

      [user:error,2023-11-06T22:09:15.786-08:00,ns_1@172.23.109.7:<0.2756.0>:ns_orchestrator:log_rebalance_completion:1433]Rebalance exited with reason {mover_crashed,
       {unexpected_exit,
        {'EXIT',<0.8263.1>,
         {{wait_seqno_persisted_failed,"default",1014,1036,
           [{'ns_1@172.23.109.7',
             {'EXIT',
              {{{badmatch,{error,closed}},
                [{mc_client_binary,cmd_vocal_recv,5,
                  [{file,"src/mc_client_binary.erl"},{line,151}]},
                 {mc_client_binary,wait_for_seqno_persistence,3,
                  [{file,"src/mc_client_binary.erl"},{line,742}]},
                 {ns_memcached,'-wait_for_seqno_persistence/3-fun-0-',3,
                  [{file,"src/ns_memcached.erl"},{line,1473}]},
                 {ns_memcached,'-perform_very_long_call/3-fun-0-',2,
                  [{file,"src/ns_memcached.erl"},{line,338}]},
                 {ns_memcached_sockets_pool,'-executing_on_socket/3-fun-0-',3,
                  [{file,"src/ns_memcached_sockets_pool.erl"},{line,86}]},
                 {async,'-async_init/4-fun-1-',3,
                  [{file,"src/async.erl"},{line,191}]}]},
               {gen_server,call,
                [{'janitor_agent-default','ns_1@172.23.109.7'},
                 {if_rebalance,<0.15925.0>,{wait_seqno_persisted,1014,1036}},
                 infinity]}}}}]},
          [{ns_single_vbucket_mover,'-wait_seqno_persisted_many/5-fun-2-',5,
            [{file,"src/ns_single_vbucket_mover.erl"},{line,474}]},
           {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,211}]}]}}}}.
      Rebalance Operation Id = b655fb6e66fa020f937b266a243eaa51

      grep "CRITICAL" on memcached.log on 172.23.109.7

      grep " CRITICAL " memcached.log
       
       
      2023-11-06T22:09:15.555387-08:00 CRITICAL *** Fatal error encountered during exception handling ***
      2023-11-06T22:09:15.555448-08:00 CRITICAL Caught unhandled std::exception-derived exception. what(): ActiveDurabilityMonitor::commit vb:595 failed with status: no such key
      2023-11-06T22:09:15.795878-08:00 CRITICAL Detected previous crash
      2023-11-06T22:09:15.795920-08:00 CRITICAL Breakpad caught a crash (Couchbase version 7.2.3-6705). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/53b8ef68-9d2b-4ddd-6e9bacb0-492be93e.dmp before terminating.
      2023-11-06T22:09:15.795923-08:00 CRITICAL Stack backtrace of crashed thread:
      2023-11-06T22:09:15.795923-08:00 CRITICAL    #0  /opt/couchbase/bin/memcached() [0x400000+0x74dc88]
      2023-11-06T22:09:15.795924-08:00 CRITICAL    #1  /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ea) [0x400000+0x79f7ba]
      2023-11-06T22:09:15.795926-08:00 CRITICAL    #2  /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0xb8) [0x400000+0x79faf8]
      2023-11-06T22:09:15.795996-08:00 CRITICAL    #3  /lib/x86_64-linux-gnu/libpthread.so.0() [0x7f6451c2c000+0x12730]
      2023-11-06T22:09:15.796006-08:00 CRITICAL    #4  /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b) [0x7f644fd6f000+0x378eb]
      2023-11-06T22:09:15.796007-08:00 CRITICAL    #5  /lib/x86_64-linux-gnu/libc.so.6(abort+0x121) [0x7f644fd6f000+0x22535]
      2023-11-06T22:09:15.796010-08:00 CRITICAL    #6  /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xa89ab]
      2023-11-06T22:09:15.796010-08:00 CRITICAL    #7  /opt/couchbase/bin/memcached() [0x400000+0x7581fb]
      2023-11-06T22:09:15.796010-08:00 CRITICAL    #8  /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xb82fa]
      2023-11-06T22:09:15.796011-08:00 CRITICAL    #9  /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xb8365]
      2023-11-06T22:09:15.796012-08:00 CRITICAL    #10 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xb85b7]
      2023-11-06T22:09:15.796014-08:00 CRITICAL    #11 /opt/couchbase/bin/memcached() [0x400000+0xcef17]
      2023-11-06T22:09:15.796014-08:00 CRITICAL    #12 /opt/couchbase/bin/memcached() [0x400000+0xccda1]
      2023-11-06T22:09:15.796016-08:00 CRITICAL    #13 /opt/couchbase/bin/memcached() [0x400000+0x4251b2]
      2023-11-06T22:09:15.796016-08:00 CRITICAL    #14 /opt/couchbase/bin/memcached() [0x400000+0x33b174]
      2023-11-06T22:09:15.796016-08:00 CRITICAL    #15 /opt/couchbase/bin/memcached() [0x400000+0x467928]
      2023-11-06T22:09:15.796017-08:00 CRITICAL    #16 /opt/couchbase/bin/memcached() [0x400000+0x6c6819]
      2023-11-06T22:09:15.796115-08:00 CRITICAL    #17 /opt/couchbase/bin/memcached() [0x400000+0x6bfefa]
      2023-11-06T22:09:15.796116-08:00 CRITICAL    #18 /opt/couchbase/bin/memcached() [0x400000+0x6c7bce]
      2023-11-06T22:09:15.796117-08:00 CRITICAL    #19 /opt/couchbase/bin/memcached() [0x400000+0x825d10]
      2023-11-06T22:09:15.796118-08:00 CRITICAL    #20 /opt/couchbase/bin/memcached() [0x400000+0x8107aa]
      2023-11-06T22:09:15.796120-08:00 CRITICAL    #21 /opt/couchbase/bin/memcached() [0x400000+0x828cc9]
      2023-11-06T22:09:15.796120-08:00 CRITICAL    #22 /opt/couchbase/bin/memcached() [0x400000+0x6bfbf4]
      2023-11-06T22:09:15.796121-08:00 CRITICAL    #23 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xe4aa3]
      2023-11-06T22:09:15.796122-08:00 CRITICAL    #24 /lib/x86_64-linux-gnu/libpthread.so.0() [0x7f6451c2c000+0x7fa3]
      2023-11-06T22:09:15.796124-08:00 CRITICAL    #25 /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f644fd6f000+0xf906f] 
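      Most frames in the backtrace above are unsymbolized and appear only as [base+offset] pairs. As a triage aid, here is a minimal Python sketch that splits each frame line into its module path, optional symbol, load base and offset. The names FRAME_RE, Frame and parse_frame are hypothetical helpers written for this ticket and are not part of any Couchbase tooling; the regular expression assumes the frame layout seen in the memcached.log lines above.

```python
import re
from typing import NamedTuple, Optional

# Assumed frame layout, taken from the log lines in this ticket, e.g.:
#   "... CRITICAL    #4  /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b) [0x7f644fd6f000+0x378eb]"
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+"                 # frame number
    r"(?P<module>[^\s(]+)"                # path of the binary or shared library
    r"\((?P<symbol>[^)]*)\)\s+"           # "symbol+0xNN", or empty when unsymbolized
    r"\[0x(?P<base>[0-9a-fA-F]+)\+0x(?P<offset>[0-9a-fA-F]+)\]"
)


class Frame(NamedTuple):
    index: int
    module: str
    symbol: str   # empty string when the log carries no symbol
    base: int     # load base printed in the log
    offset: int   # offset from that base

    @property
    def address(self) -> int:
        """Absolute address, i.e. base + offset as printed in the log."""
        return self.base + self.offset


def parse_frame(line: str) -> Optional[Frame]:
    """Return the parsed frame, or None if the line is not a backtrace frame."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return Frame(
        index=int(m["index"]),
        module=m["module"],
        symbol=m["symbol"],
        base=int(m["base"], 16),
        offset=int(m["offset"], 16),
    )


if __name__ == "__main__":
    sample = (
        "2023-11-06T22:09:15.795923-08:00 CRITICAL    "
        "#0  /opt/couchbase/bin/memcached() [0x400000+0x74dc88]"
    )
    frame = parse_frame(sample)
    print(frame.module, hex(frame.offset), hex(frame.address))
```

      Grouping the parsed frames by module gives the list of memcached offsets to look up against debug symbols for build 7.2.3-6705. Whether the symbolizer expects the offset or the absolute address depends on how the binary was linked, so check that before relying on the result.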

      grep "CRITICAL" on babysitter.log on 172.23.109.7

      grep "CRITICAL" ns_server.babysitter.log
      memcached<0.134.0>: 2023-11-06T22:09:15.555387-08:00 CRITICAL *** Fatal error encountered during exception handling ***
      memcached<0.134.0>: 2023-11-06T22:09:15.555448-08:00 CRITICAL Caught unhandled std::exception-derived exception. what(): ActiveDurabilityMonitor::commit vb:595 failed with status: no such key
      memcached<0.134.0>: CRITICAL Breakpad caught a crash (Couchbase version 7.2.3-6705). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/53b8ef68-9d2b-4ddd-6e9bacb0-492be93e.dmp before terminating.
                                   <<"CRITICAL Breakpad caught a crash (Couchbase version "...>>,
                                   <<"2023-11-06T22:09:15.555448-08:00 CRITICAL Caught"...>>,
                                   <<"2023-11-06T22:09:15.555387-08:00 CRITICAL **"...>>,
      [ns_server:info,2023-11-06T22:09:15.996-08:00,babysitter_of_ns_1@cb.local:<0.384.0>:ns_port_server:log:226]memcached<0.384.0>: 2023-11-06T22:09:15.795878-08:00 CRITICAL Detected previous crash
      memcached<0.384.0>: 2023-11-06T22:09:15.795920-08:00 CRITICAL Breakpad caught a crash (Couchbase version 7.2.3-6705). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/53b8ef68-9d2b-4ddd-6e9bacb0-492be93e.dmp before terminating.
      memcached<0.384.0>: 2023-11-06T22:09:15.795923-08:00 CRITICAL Stack backtrace of crashed thread:
      memcached<0.384.0>: 2023-11-06T22:09:15.795923-08:00 CRITICAL    #0  /opt/couchbase/bin/memcached() [0x400000+0x74dc88]
      memcached<0.384.0>: 2023-11-06T22:09:15.795924-08:00 CRITICAL    #1  /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ea) [0x400000+0x79f7ba]
      memcached<0.384.0>: 2023-11-06T22:09:15.795926-08:00 CRITICAL    #2  /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0xb8) [0x400000+0x79faf8]
      memcached<0.384.0>: 2023-11-06T22:09:15.795996-08:00 CRITICAL    #3  /lib/x86_64-linux-gnu/libpthread.so.0() [0x7f6451c2c000+0x12730]
      memcached<0.384.0>: 2023-11-06T22:09:15.796006-08:00 CRITICAL    #4  /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b) [0x7f644fd6f000+0x378eb]
      memcached<0.384.0>: 2023-11-06T22:09:15.796007-08:00 CRITICAL    #5  /lib/x86_64-linux-gnu/libc.so.6(abort+0x121) [0x7f644fd6f000+0x22535]
      memcached<0.384.0>: 2023-11-06T22:09:15.796010-08:00 CRITICAL    #6  /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xa89ab]
      memcached<0.384.0>: 2023-11-06T22:09:15.796010-08:00 CRITICAL    #7  /opt/couchbase/bin/memcached() [0x400000+0x7581fb]
      memcached<0.384.0>: 2023-11-06T22:09:15.796010-08:00 CRITICAL    #8  /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xb82fa]
      memcached<0.384.0>: 2023-11-06T22:09:15.796011-08:00 CRITICAL    #9  /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xb8365]
      memcached<0.384.0>: 2023-11-06T22:09:15.796012-08:00 CRITICAL    #10 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xb85b7]
      memcached<0.384.0>: 2023-11-06T22:09:15.796014-08:00 CRITICAL    #11 /opt/couchbase/bin/memcached() [0x400000+0xcef17]
      memcached<0.384.0>: 2023-11-06T22:09:15.796014-08:00 CRITICAL    #12 /opt/couchbase/bin/memcached() [0x400000+0xccda1]
      memcached<0.384.0>: 2023-11-06T22:09:15.796016-08:00 CRITICAL    #13 /opt/couchbase/bin/memcached() [0x400000+0x4251b2]
      memcached<0.384.0>: 2023-11-06T22:09:15.796016-08:00 CRITICAL    #14 /opt/couchbase/bin/memcached() [0x400000+0x33b174]
      memcached<0.384.0>: 2023-11-06T22:09:15.796016-08:00 CRITICAL    #15 /opt/couchbase/bin/memcached() [0x400000+0x467928]
      memcached<0.384.0>: 2023-11-06T22:09:15.796017-08:00 CRITICAL    #16 /opt/couchbase/bin/memcached() [0x400000+0x6c6819]
      memcached<0.384.0>: 2023-11-06T22:09:15.796115-08:00 CRITICAL    #17 /opt/couchbase/bin/memcached() [0x400000+0x6bfefa]
      memcached<0.384.0>: 2023-11-06T22:09:15.796116-08:00 CRITICAL    #18 /opt/couchbase/bin/memcached() [0x400000+0x6c7bce]
      memcached<0.384.0>: 2023-11-06T22:09:15.796117-08:00 CRITICAL    #19 /opt/couchbase/bin/memcached() [0x400000+0x825d10]
      memcached<0.384.0>: 2023-11-06T22:09:15.796118-08:00 CRITICAL    #20 /opt/couchbase/bin/memcached() [0x400000+0x8107aa]
      memcached<0.384.0>: 2023-11-06T22:09:15.796120-08:00 CRITICAL    #21 /opt/couchbase/bin/memcached() [0x400000+0x828cc9]
      memcached<0.384.0>: 2023-11-06T22:09:15.796120-08:00 CRITICAL    #22 /opt/couchbase/bin/memcached() [0x400000+0x6bfbf4]
      memcached<0.384.0>: 2023-11-06T22:09:15.796121-08:00 CRITICAL    #23 /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f64500ce000+0xe4aa3]
      memcached<0.384.0>: 2023-11-06T22:09:15.796122-08:00 CRITICAL    #24 /lib/x86_64-linux-gnu/libpthread.so.0() [0x7f6451c2c000+0x7fa3]
      memcached<0.384.0>: 2023-11-06T22:09:15.796124-08:00 CRITICAL    #25 /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f644fd6f000+0xf906f] 

      172.23.109.7: Stack trace of first crash: 53b8ef68-9d2b-4ddd-6e9bacb0-492be93e.dmp

       


      TAF Script to reproduce

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /data/workspace/debian-p0-durability-vset00-00-tunable_rebalance_swap_majority_6.5_P0/testexec.32030.ini process_concurrency=8,sdk_timeout=60,num_items=100000,doc_size=256,GROUP=P0;durability,EXCLUDE_GROUP=non_dgm;not_for_majority,durability=MAJORITY,active_resident_threshold=70,get-cbcollect-info=True,infra_log_level=info,log_level=info,bucket_storage=couchstore,upgrade_version=7.2.3-6705,sirius_url=http://172.23.120.103:4000 -t rebalance_new.swaprebalancetests.SwapRebalanceFailedTests.test_add_back_failed_node,doc_size=256,new_replica=2,standard_buckets=1,process_concurrency=8,upgrade_version=7.2.3-6705,num-swap=1,GROUP=P0;durability,EXCLUDE_GROUP=non_dgm;not_for_majority,sdk_timeout=60,get-cbcollect-info=True,replicas=1,durability=MAJORITY,active_resident_threshold=70,log_level=info,bucket_storage=couchstore,nodes_init=4,num_items=100000,sirius_url=http://172.23.120.103:4000,infra_log_level=info' 

      Job name : durability-tunable_rebalance_swap_majority_6.5_P0

      Job ref link : http://cb-logs-qe.s3-website-us-west-2.amazonaws.com/7.2.3-6705/jenkins_logs/test_suite_executor-TAF/282643/

       


            People

              raghav.sk Raghav S K
              raghav.sk Raghav S K