Couchbase Server / MB-44079

Ephemeral out-of-order purging can cause prepares to be recommitted and DurabilityMonitor monotonicity exceptions to be thrown

Details

    Description

      Script to Repro

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/testexec.116702.ini GROUP=durability_majority,rerun=False,skip_log_scan=False,get-cbcollect-info=False,infra_log_level=critical,log_level=error,bucket_storage=couchstore,upgrade_version=7.0.0-4362 -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_rebalance_in,nodes_init=3,nodes_in=2,override_spec_params=durability;replicas,durability=MAJORITY,replicas=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests,data_load_stage=during,skip_validations=False,GROUP=durability_majority'
      

      The test rebalances in 2 nodes to a cluster initialised with 3 nodes (nodes_init=3, nodes_in=2), with a durability=MAJORITY data load running in parallel.
      The rebalance fails (ST attached as rebalance_failure_ST.txt).

      One minidump is seen on the .57 node (172.23.123.57):

      2021-02-02 18:24:39,107 | test  | CRITICAL | MainThread | [basetestcase:check_coredump_exist:728] 172.23.123.57: 1 core dump seen
      

      memcached.log on .57 shows an unhandled exception followed by a Breakpad crash dump:

      2021-02-02T18:24:20.031995-08:00 CRITICAL *** Fatal error encountered during exception handling ***
      2021-02-02T18:24:20.032058-08:00 CRITICAL Caught unhandled std::exception-derived exception. what(): std::exception
      2021-02-02T18:24:20.055523-08:00 INFO 1464: (bucket2) DCP (Consumer) eq_dcpq:replication:ns_1@172.23.123.68->ns_1@172.23.123.57:bucket2 - (vb:544) Attempting to add stream: opaque_:123, start_seqno_:67, end_seqno_:18446744073709551615, vb_uuid:85106092359276, snap_start_seqno_:67, snap_end_seqno_:67, last_seqno:67, stream_req_value:{"uid":"7"}

      2021-02-02T18:24:20.272001-08:00 CRITICAL Breakpad caught a crash (Couchbase version 7.0.0-4362). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/c9271b0b-61a6-463f-1314aea1-609387e4.dmp before terminating.
      2021-02-02T18:24:20.272032-08:00 CRITICAL Stack backtrace of crashed thread:
      2021-02-02T18:24:20.272286-08:00 CRITICAL     /opt/couchbase/bin/memcached() [0x400000+0x145bbd]
      2021-02-02T18:24:20.272299-08:00 CRITICAL     /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ea) [0x400000+0x15b3fa]
      2021-02-02T18:24:20.272309-08:00 CRITICAL     /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0xb8) [0x400000+0x15b738]
      2021-02-02T18:24:20.272393-08:00 CRITICAL     /lib64/libpthread.so.0() [0x7f4b54fff000+0xf5f0]
      2021-02-02T18:24:20.272417-08:00 CRITICAL     /lib64/libc.so.6(gsignal+0x37) [0x7f4b54c31000+0x36337]
      2021-02-02T18:24:20.272436-08:00 CRITICAL     /lib64/libc.so.6(abort+0x148) [0x7f4b54c31000+0x37a28]
      2021-02-02T18:24:20.272473-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x125) [0x7f4b55734000+0x91195]
      2021-02-02T18:24:20.272489-08:00 CRITICAL     /opt/couchbase/bin/memcached() [0x400000+0x155632]
      2021-02-02T18:24:20.272504-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f4b55734000+0x8ef86]
      2021-02-02T18:24:20.272519-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f4b55734000+0x8efd1]
      2021-02-02T18:24:20.272534-08:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7f4b59061000+0x16eb23]
      2021-02-02T18:24:20.272543-08:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7f4b59061000+0x168b82]
      2021-02-02T18:24:20.272553-08:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7f4b59061000+0x2e7be6]
      2021-02-02T18:24:20.272562-08:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7f4b59061000+0x2d00da]
      2021-02-02T18:24:20.272571-08:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7f4b59061000+0x2ead09]
      2021-02-02T18:24:20.272580-08:00 CRITICAL     /opt/couchbase/bin/../lib/libep.so() [0x7f4b59061000+0x166fc3]
      2021-02-02T18:24:20.272606-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7f4b55734000+0xb9dcf]
      2021-02-02T18:24:20.272611-08:00 CRITICAL     /lib64/libpthread.so.0() [0x7f4b54fff000+0x7e65]
      2021-02-02T18:24:20.272640-08:00 CRITICAL     /lib64/libc.so.6(clone+0x6d) [0x7f4b54c31000+0xfe88d]
      
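When triaging a backtrace like the one above, it can help to pull out the module and offset of each frame so they can be passed to a symboliser together with matching debug symbols. The following is a minimal sketch that assumes only the `path(symbol) [0xBASE+0xOFFSET]` layout visible in this log. The names `Frame` and `parse_frames` are hypothetical helpers, not part of any Couchbase tooling, and whether a symboliser expects the offset or the absolute address depends on how the binary was linked.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    module: str   # path of the binary or shared library
    symbol: str   # mangled symbol+offset if the log printed one, else ""
    base: int     # load address printed in the log
    offset: int   # offset of the frame within the module

    @property
    def absolute(self) -> int:
        # Absolute address of the frame in the crashed process.
        return self.base + self.offset


# Matches e.g. "CRITICAL   /path/libep.so() [0x7f4b59061000+0x16eb23]".
_FRAME_RE = re.compile(
    r"CRITICAL\s+([^\s(]+)\(([^)]*)\)\s+\[0x([0-9a-fA-F]+)\+0x([0-9a-fA-F]+)\]"
)


def parse_frames(log_text: str) -> List[Frame]:
    """Return the backtrace frames found in log_text, in log order."""
    frames = []
    for line in log_text.splitlines():
        m = _FRAME_RE.search(line)
        if m:
            module, symbol, base, offset = m.groups()
            frames.append(Frame(module, symbol, int(base, 16), int(offset, 16)))
    return frames
```

Filtering the result on `module.endswith("libep.so")` would isolate the frames that are unsymbolised in the log above.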

      An error is also logged on the .68 node (172.23.123.68):

      2021-02-02 18:24:37,249 | test  | CRITICAL | MainThread | [basetestcase:check_coredump_exist:801] 172.23.123.68: Found 'exception occurred in runloop' logs - ['2021-02-02T18:24:20.259974-08:00 WARNING 1598: exception occurred in runloop during packet execution. Closing connection: PassiveDurabilityMonitor::completeSyncWrite vb:141 No tracked, but received commit for key <ud>cid:0x8:test_collections-359</ud>. Cookies: [{"aiostat":"success","connection":"[ {\\"ip\\":\\"127.0.0.1\\",\\"port\\":60264} - {\\"ip\\":\\"127.0.0.1\\",\\"port\\":11209} (<ud>@ns_server</ud>) ]","engine_storage":"0x0000000000000000","ewouldblock":false,"packet":{"bodylen":37,"cas":0,"datatype":"raw","extlen":16,"key":"<ud>.test_collections-359</ud>","keylen":21,"magic":"ClientRequest","opaque":47,"opcode":"DCP_COMMIT","vbucket":141},"refcount":1}]\n']
      
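The .68 log line reports a commit arriving for a key that the passive monitor was not tracking. The toy model below illustrates that class of failure only. It is not Couchbase's implementation: `ToyPassiveMonitor`, `UntrackedCommitError`, and the seqno used in the usage note are hypothetical, modelled loosely on the wording of the log message. In this sketch, a second commit for an already completed prepare is one way to hit the error.

```python
class UntrackedCommitError(Exception):
    """Raised when a commit arrives for a key with no tracked prepare."""


class ToyPassiveMonitor:
    """Tracks outstanding prepares for one vBucket (illustrative only)."""

    def __init__(self, vbid: int) -> None:
        self.vbid = vbid
        self.tracked = {}  # key -> seqno of the outstanding prepare

    def add_prepare(self, key: str, seqno: int) -> None:
        # Start tracking a prepare received from the active node.
        self.tracked[key] = seqno

    def complete(self, key: str) -> int:
        # A commit is only valid if the matching prepare is still tracked.
        if key not in self.tracked:
            raise UntrackedCommitError(
                f"vb:{self.vbid} no tracked prepare, "
                f"but received commit for key {key}"
            )
        return self.tracked.pop(key)
```

For example, calling `add_prepare("k", 10)` and then `complete("k")` twice succeeds the first time and raises `UntrackedCommitError` the second time, because the prepare is no longer tracked.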

      Attachments

        1. 172.23.123.57_thread_dump.rtf
          48 kB
          Ritesh Agarwal
        2. bt_full_c9271b0b-61a6-463f-1314aea1-609387e4.dmp.txt
          179 kB
          Sumedh Basarkod
        3. rebalance_failure_ST.txt
          9 kB
          Sumedh Basarkod

            People

              sumedh.basarkod Sumedh Basarkod (Inactive)