Couchbase Server
MB-37161

[Upgrade] Failover upgrade failed during rebalance of the last upgraded node in the cluster


Details

    • Untriaged
    • Centos 64-bit
    • Yes
    • KV-Engine Mad-Hatter GA, KV Sprint 2019-12

    Description

      Steps to reproduce

      • Install Couchbase Server 5.0.0 on 3 CentOS 7.6 servers and create a cluster with services on each node (kv,index - kv,n1ql - kv,fts).
      • Create a default bucket and load data into it.
      • Gracefully fail over the 1st node and upgrade it from 5.0.0 to 6.5.0-4908 (use rpm -U cbserver.rpm).
      • Set full recovery and rebalance. The rebalance succeeds as expected.
      • Gracefully fail over the 2nd node and upgrade it from 5.0.0 to 6.5.0-4908 (use rpm -U cbserver.rpm).
      • Set full recovery and rebalance. The rebalance succeeds as expected.
      • Gracefully fail over the 3rd node (the last 5.0.0 one, with IP .224) and upgrade it from 5.0.0 to 6.5.0-4908 (use rpm -U cbserver.rpm).
      • Set full recovery and rebalance. The rebalance fails instantly with error "'memcached' exited with status 134."
      • Rebalance again; this time the rebalance succeeds.

      Attachments

        Issue Links


          Activity

            thuan Thuan Nguyen created issue -
            arunkumar Arunkumar Senthilnathan (Inactive) made changes -
            Field Original Value New Value
            Link This issue is duplicated by MB-37103 [ MB-37103 ]
            arunkumar Arunkumar Senthilnathan (Inactive) made changes -
            Resolution Duplicate [ 3 ]
            Status Open [ 1 ] Resolved [ 5 ]
            thuan Thuan Nguyen added a comment -

            I tried upgrading to build 6.5.0-4922 and got the same issue.


            arunkumar Arunkumar Senthilnathan (Inactive) added a comment -

            Thuan Nguyen: in that case, please see if there are any other open bugs in MH with the same issue. If not, and this is consistent, please reopen and mark it as critical for dev triage.
            thuan Thuan Nguyen made changes -
            thuan Thuan Nguyen made changes -
            Attachment rebalanceReport.json [ 78733 ]
            thuan Thuan Nguyen made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            thuan Thuan Nguyen made changes -
            Resolution Duplicate [ 3 ]
            Status Resolved [ 5 ] Reopened [ 4 ]

            thuan Thuan Nguyen added a comment -

            Reopening this ticket because the failure still occurs even after upgrading to the latest build today, 6.5.0-4925.
            dfinlay Dave Finlay made changes -
            Component/s couchbase-bucket [ 10173 ]
            Component/s ns_server [ 10019 ]
            dfinlay Dave Finlay made changes -
            Assignee Dave Finlay [ dfinlay ] Daniel Owen [ owend ]
            dfinlay Dave Finlay added a comment -

            Assigning to couchbase-bucket based on the "memcached exit with status 134" comment.

            lynn.straus Lynn Straus made changes -
            Labels approved-for-mad-hatter
            owend Daniel Owen added a comment - - edited

            Seeing the following error with build 6.5.0-4922 on 172.23.121.226. (see 172.23.121.226_20191205-022439)

            GSL: Precondition failure at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc: 1184
            

            The Expects() that is failing is as follows:

                if (sendHCS) {
                    Expects(hcsToSend.is_initialized());  // <<<<< Expects that is failing
                    log(spdlog::level::level_enum::info,
                        "{} ActiveStream::snapshot: Sending disk snapshot with start "
                        "seqno {}, end seqno {}, and"
                        " high completed seqno {}",
                        logPrefix,
                        snapStart,
                        snapEnd,
                        hcsToSend);
                }
            

            It looks like the fix in MB-37103 is not sufficient.
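            The failure mode can be reproduced in isolation with a minimal sketch (illustrative only: std::optional stands in for boost::optional, a plain throw stands in for gsl-lite's Expects, and the function name is hypothetical, not actual kv_engine code):

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>

// Illustrative sketch, not kv_engine code: sendHCS is derived from the
// checkpoint type, while hcsToSend is derived from what the optional
// actually holds, so the two can disagree for a Disk checkpoint whose
// snapshot marker never carried a high completed seqno (HCS).
void sendSnapshotSketch(bool isDiskCheckpoint,
                        std::optional<uint64_t> hcsToSend) {
    const bool sendHCS = isDiskCheckpoint; // Disk snapshots must carry an HCS
    if (sendHCS && !hcsToSend.has_value()) {
        // In the real code this is Expects(hcsToSend.is_initialized()),
        // whose failure aborts memcached (exit status 134).
        throw std::logic_error("GSL: Precondition failure");
    }
    // ... otherwise the snapshot marker would be sent as normal ...
}
```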

            owend Daniel Owen made changes -
            Assignee Daniel Owen [ owend ] Dave Rigby [ drigby ]
            thuan Thuan Nguyen added a comment -

            Uploaded logs for build 6.5.0-4925.

            thuan Thuan Nguyen added a comment -

            The last build which passed this test is 6.5.0-4874, as shown in the attached rebalance report.
            thuan Thuan Nguyen made changes -
            Attachment rebalanceReport_650-4874.json [ 78750 ]
            drigby Dave Rigby made changes -
            Status Reopened [ 4 ] In Progress [ 3 ]
            drigby Dave Rigby added a comment - - edited

            Node .226 (which experienced the crash) has an ERROR earlier, when processing a DCP_SEQNO_ACKNOWLEDGEMENT before the topology is set up, resulting in the replication stream to .225 being closed:

            2019-12-04T17:57:52.233463-08:00 ERROR 56: exception occurred in runloop during packet execution. Cookie info: [{"aiostat":"success","connection":"[ 172.23.121.225:47004 - 172.23.121.226:11209 (<ud>@ns_server</ud>) ]","engine_storage":"0x00007fca9c06a010","ewouldblock":false,"packet":{"bodylen":8,"cas":0,"datatype":"raw","extlen":8,"key":"<ud></ud>","keylen":0,"magic":"ClientRequest","opaque":1,"opcode":"DCP_SEQNO_ACKNOWLEDGED","vbucket":854},"refcount":1}] - closing connection ([ 172.23.121.225:47004 - 172.23.121.226:11209 (<ud>@ns_server</ud>) ]): ActiveDurabilityMonitor::State::processSeqnoAck vb:854 FirstChain not set
            2019-12-04T17:57:52.233542-08:00 INFO 56: (No Engine) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.226->ns_1@172.23.121.225:default - Removing connection [ 172.23.121.225:47004 - 172.23.121.226:11209 (<ud>@ns_server</ud>) ]
            2019-12-04T17:57:52.233568-08:00 WARNING 56: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.226->ns_1@172.23.121.225:default - (vb:854) Stream closing, sent until seqno 2 remaining items 0, reason: The stream closed early because the conn was disconnected
            ... all other streams on this connection also close ...
            ...
            # ns_server sets topology on all vBuckets, including vb:854 which encountered the error above:
            2019-12-04T17:57:56.845170-08:00 INFO (default) VBucket::setState: transitioning vb:854 with high seqno:2 from:active to:active meta:{"topology":[["ns_1@172.23.121.226","ns_1@172.23.121.225"]]}
            

            I think this is relatively benign; however, we should investigate it. What I believe should happen is that the SEQNO_ACK message should be queued until the topology is set.
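            A minimal sketch of that suggested queuing behaviour (purely illustrative; the class and member names are hypothetical, not the actual ActiveDurabilityMonitor code):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch of the suggested behaviour: instead of throwing
// "FirstChain not set" when a SEQNO_ACK arrives before ns_server has set
// the replication topology, buffer the ack and replay it once the
// topology arrives.
struct SeqnoAck {
    uint64_t vbid;
    uint64_t seqno;
};

class DurabilityMonitorSketch {
public:
    // Returns true if the ack was processed, false if it was queued.
    bool processSeqnoAck(const SeqnoAck& ack) {
        if (!topologySet) {
            pendingAcks.push_back(ack); // queue until the topology is known
            return false;
        }
        processedAcks.push_back(ack);
        return true;
    }

    void setTopology() {
        topologySet = true;
        // Replay any acks that raced ahead of the topology.
        for (const auto& ack : pendingAcks) {
            processedAcks.push_back(ack);
        }
        pendingAcks.clear();
    }

    std::vector<SeqnoAck> processedAcks;

private:
    bool topologySet = false;
    std::vector<SeqnoAck> pendingAcks;
};
```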

            Back to the crash, it occurs when trying to send a DCP snapshot marker from node .226 to node .224 for vb:334:

            2019-12-04T17:58:45.836499-08:00 INFO 62: Client 172.23.121.224:45433 authenticated as <ud>@ns_server</ud>
            2019-12-04T17:58:45.842166-08:00 INFO 62: HELO [proxy] XATTR, Snappy, JSON [ 172.23.121.224:45433 - 172.23.121.226:11209 (<ud>@ns_server</ud>) ]
            2019-12-04T17:58:45.845365-08:00 INFO 62: DCP connection opened successfully. PRODUCER, INCLUDE_XATTRS, DELETE_TIMES [ 172.23.121.224:45433 - 172.23.121.226:11209 (<ud>@ns_server</ud>) ]
            2019-12-04T17:58:45.881224-08:00 INFO 62: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.226->ns_1@172.23.121.224:default - (vb:334) Creating stream with start seqno 0 and end seqno 18446744073709551615; requested end seqno was 18446744073709551615, collections-manifest uid:none, sid:none
            2019-12-04T17:58:45.881333-08:00 INFO 62: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.226->ns_1@172.23.121.224:default - (vb:334) ActiveStream::scheduleBackfill_UNLOCKED register cursor with name "eq_dcpq:replication:ns_1@172.23.121.226->ns_1@172.23.121.224:default" backfill:true, seqno:1
            2019-12-04T17:58:45.881356-08:00 INFO 62: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.226->ns_1@172.23.121.224:default - (vb:334) ActiveStream::transitionState: Transitioning from backfilling to in-memory
            2019-12-04T17:58:45.881753-08:00 CRITICAL *** Fatal error encountered during exception handling ***
            2019-12-04T17:58:45.881817-08:00 CRITICAL Caught unhandled std::exception-derived exception. what(): GSL: Precondition failure at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc: 1184
            2019-12-04T17:58:46.007109-08:00 CRITICAL Breakpad caught a crash (Couchbase version 6.5.0-4922). Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/579b9573-ae2a-e952-58b0a7a6-726024ee.dmp before terminating.
            2019-12-04T17:58:46.007138-08:00 CRITICAL Stack backtrace of crashed thread:
            2019-12-04T17:58:46.007421-08:00 CRITICAL     /opt/couchbase/bin/memcached() [0x400000+0x13243d]
            2019-12-04T17:58:46.007452-08:00 CRITICAL     /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler12GenerateDumpEPNS0_12CrashContextE+0x3ce) [0x400000+0x14a47e]
            2019-12-04T17:58:46.007473-08:00 CRITICAL     /opt/couchbase/bin/memcached(_ZN15google_breakpad16ExceptionHandler13SignalHandlerEiP9siginfo_tPv+0x94) [0x400000+0x14a794]
            2019-12-04T17:58:46.007489-08:00 CRITICAL     /lib64/libpthread.so.0() [0x7fca9d7e6000+0xf5e0]
            2019-12-04T17:58:46.007538-08:00 CRITICAL     /lib64/libc.so.6(gsignal+0x37) [0x7fca9d423000+0x351f7]
            2019-12-04T17:58:46.007586-08:00 CRITICAL     /lib64/libc.so.6(abort+0x148) [0x7fca9d423000+0x368e8]
            2019-12-04T17:58:46.007662-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x125) [0x7fca9df1b000+0x91195]
            2019-12-04T17:58:46.007686-08:00 CRITICAL     /opt/couchbase/bin/memcached() [0x400000+0x145d72]
            2019-12-04T17:58:46.007757-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fca9df1b000+0x8ef86]
            2019-12-04T17:58:46.007828-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fca9df1b000+0x8efd1]
            2019-12-04T17:58:46.007892-08:00 CRITICAL     /opt/couchbase/bin/../lib/libstdc++.so.6() [0x7fca9df1b000+0x8f213]
            2019-12-04T17:58:46.007918-08:00 CRITICAL     /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fca98a88000+0x74098]
            2019-12-04T17:58:46.007937-08:00 CRITICAL     /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fca98a88000+0x77434]
            2019-12-04T17:58:46.007956-08:00 CRITICAL     /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fca98a88000+0x77843]
            2019-12-04T17:58:46.007975-08:00 CRITICAL     /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fca98a88000+0x77924]
            2019-12-04T17:58:46.007994-08:00 CRITICAL     /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fca98a88000+0x809f9]
            2019-12-04T17:58:46.008015-08:00 CRITICAL     /opt/couchbase/bin/../lib/../lib/ep.so() [0x7fca98a88000+0x12f964]
            2019-12-04T17:58:46.008025-08:00 CRITICAL     /opt/couchbase/bin/../lib/libplatform_so.so.0.1.0() [0x7fca9fdc4000+0x8ee7]
            2019-12-04T17:58:46.008038-08:00 CRITICAL     /lib64/libpthread.so.0() [0x7fca9d7e6000+0x7e25]
            2019-12-04T17:58:46.008100-08:00 CRITICAL     /lib64/libc.so.6(clone+0x6d) [0x7fca9d423000+0xf834d]
            

            Full backtrace from minidump via gdb:

            (gdb) bt full
            #0  0x00007fca9d4581f7 in raise () from /usr/lib64/libc-2.17.so
            No symbol table info available.
            #1  0x00007fca9d4598e8 in abort () from /usr/lib64/libc-2.17.so
            No symbol table info available.
            #2  0x00007fca9dfac195 in __gnu_cxx::__verbose_terminate_handler () at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/vterminate.cc:95
                    terminating = false
                    t = <optimized out>
            #3  0x0000000000545d72 in backtrace_terminate_handler () at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/utilities/terminate_handler.cc:86
            No locals.
            #4  0x00007fca9dfa9f86 in __cxxabiv1::__terminate (handler=<optimized out>) at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
            No locals.
            #5  0x00007fca9dfa9fd1 in std::terminate () at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:57
            No locals.
            #6  0x00007fca9dfaa213 in __cxxabiv1::__cxa_throw (obj=obj@entry=0x7fca400009f0, tinfo=tinfo@entry=0x7fca98f9d110 <typeinfo for gsl::fail_fast>, dest=dest@entry=0x7fca98adfc90 <gsl::fail_fast::~fail_fast()>) at /tmp/deploy/gcc-7.3.0/libstdc++-v3/libsupc++/eh_throw.cc:93
                    globals = <optimized out>
                    header = 0x7fca40000970
            #7  0x00007fca98afc098 in fail_fast_assert (message=0x7fca98cf0a48 "GSL: Precondition failure at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc: 1184", cond=false)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/third_party/gsl-lite/include/gsl/gsl-lite.h:473
            No locals.
            #8  ActiveStream::snapshot (this=this@entry=0x7fca68345d10, checkpointType=<optimized out>, items=..., highCompletedSeqno=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc:1184
                    flags = <optimized out>
                    snapEnd = 6
                    hcsToSend = {<boost::optional_detail::tc_optional_base<unsigned long>> = {<boost::optional_detail::optional_tag> = {<No data fields>}, m_initialized = false, m_storage = 140508178662144}, <No data fields>}
                    isCkptTypeDisk = <optimized out>
                    snapStart = 0
                    sendHCS = true
            #9  0x00007fca98aff434 in ActiveStream::processItems (this=this@entry=0x7fca68345d10, outstandingItemsResult=..., streamMutex=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc:1100
                    mutations = {<std::_Deque_base<std::unique_ptr<DcpResponse, std::default_delete<DcpResponse> >, std::allocator<std::unique_ptr<DcpResponse, std::default_delete<DcpResponse> > > >> = {
                        _M_impl = {<std::allocator<std::unique_ptr<DcpResponse, std::default_delete<DcpResponse> > >> = {<__gnu_cxx::new_allocator<std::unique_ptr<DcpResponse, std::default_delete<DcpResponse> > >> = {<No data fields>}, <No data fields>}, _M_map = 0x7fca687e8540, _M_map_size = 8, _M_start = {
                            _M_cur = 0x7fca62ccb200, _M_first = 0x7fca62ccb200, _M_last = 0x7fca62ccb400, _M_node = 0x7fca687e8558}, _M_finish = {_M_cur = 0x7fca62ccb220, _M_first = 0x7fca62ccb200, _M_last = 0x7fca62ccb400, _M_node = 0x7fca687e8558}}}, <No data fields>}
            #10 0x00007fca98aff843 in ActiveStream::nextCheckpointItemTask (this=this@entry=0x7fca68345d10, streamMutex=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc:867
                    res = {checkpointType = Disk, items = {<std::_Vector_base<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> >, std::allocator<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> > > >> = {
                          _M_impl = {<std::allocator<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> > >> = {<__gnu_cxx::new_allocator<SingleThreadedRCPtr<Item, Item*, std::default_delete<Item> > >> = {<No data fields>}, <No data fields>}, _M_start = 0x7fca9c01cf80, _M_finish = 0x7fca9c01cfd0, 
                            _M_end_of_storage = 0x7fca9c01d000}}, <No data fields>}, highCompletedSeqno = {<boost::optional_detail::tc_optional_base<unsigned long>> = {<boost::optional_detail::optional_tag> = {<No data fields>}, m_initialized = false, m_storage = 140508178662144}, <No data fields>}}
                    vbucket = {<std::__shared_ptr<VBucket, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<VBucket, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x7fca683d6200, _M_refcount = {_M_pi = 0x7fca921219c0}}, <No data fields>}
            #11 0x00007fca98aff924 in ActiveStream::nextCheckpointItemTask (this=0x7fca68345d10) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc:857
                    lh = {_M_device = @0x7fca68345d78}
            #12 0x00007fca98b089f9 in ActiveStreamCheckpointProcessorTask::run (this=0x7fca90365890) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream_checkpoint_processor_task.cc:56
                    rh = {<StreamContainer<std::shared_ptr<Stream> >::Iterable<std::_Fwd_list_const_iterator<std::shared_ptr<Stream> > >> = {itr = {_M_node = 0x7fca62aac2e0}, endItr = {_M_node = 0x0}, before = {_M_node = <optimized out>}}, readLock = {_M_pm = 0x7fca9c06c478, _M_owns = <optimized out>}, 
                      container = <optimized out>}
                    streams = {<std::__shared_ptr<StreamContainer<std::shared_ptr<Stream> >, (__gnu_cxx::_Lock_policy)2>> = {<std::__shared_ptr_access<StreamContainer<std::shared_ptr<Stream> >, (__gnu_cxx::_Lock_policy)2, false, false>> = {<No data fields>}, _M_ptr = 0x7fca9c06c460, _M_refcount = {
                          _M_pi = 0x7fca9c06c450}}, <No data fields>}
                    iterations = 0
                    expected = 96
            #13 0x00007fca98bb7964 in ExecutorThread::run (this=0x7fca9bf40a40) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/executorthread.cc:187
                    curTaskDescr = {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fca92183e80 <Address 0x7fca92183e80 out of bounds>}, _M_string_length = 107, {
                        _M_local_buf = "k\000\000\000\000\000\000\000۔\003\236\312\177\000", _M_allocated_capacity = 107}}
                    woketime = <optimized out>
                    scheduleOverhead = <optimized out>
                    again = <optimized out>
                    runtime = <optimized out>
                    q = <optimized out>
                    tick = 64 '@'
            

            Here we can see that in frame #8, when preparing DCP items from the CheckpointManager (in-memory phase), we are attempting to send the high_completed_seqno (sendHCS = true) but hcsToSend is not initialised (missing):

            #8  ActiveStream::snapshot (this=this@entry=0x7fca68345d10, checkpointType=<optimized out>, items=..., highCompletedSeqno=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc:1184
                    flags = <optimized out>
                    snapEnd = 6
                    hcsToSend = {<boost::optional_detail::tc_optional_base<unsigned long>> = {<boost::optional_detail::optional_tag> = {<No data fields>}, m_initialized = false, m_storage = 140508178662144}, <No data fields>}
                    isCkptTypeDisk = <optimized out>
                    snapStart = 0
                    sendHCS = true
            

            Where did this vBucket (and specifically this Checkpoint, up to seqno:6) come from? vb:334 was previously sent from node .224 to node .226. Node .226's logs (it is running 6.5.0-4922 at this point in time):

            2019-12-04T17:49:09.353854-08:00 INFO (default) VBucket: created vb:334 with state:replica initialState:dead lastSeqno:0 persistedRange:{0,0} max_cas:0 uuid:144175708859207 topology:null
            2019-12-04T17:49:09.364325-08:00 INFO (default) VBucket::setState: transitioning vb:334 with high seqno:0 from:replica to:replica
            2019-12-04T17:49:09.371449-08:00 INFO 73: (default) DCP (Consumer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb:334) Attempting to add stream: opaque_:18, start_seqno_:0, end_seqno_:18446744073709551615, vb_uuid:144175708859207, snap_start_seqno_:0, snap_end_seqno_:0, last_seqno:
            0, stream_req_value:{"uid":"0"}
            2019-12-04T17:54:15.377380-08:00 INFO (default) VBucket::setState: transitioning vb:334 with high seqno:6 from:replica to:pending
            2019-12-04T17:54:15.759335-08:00 INFO 73: (default) DCP (Consumer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb:334) Setting stream to dead state, last_seqno is 6, unAckedBytes is 0, status is The stream closed due to a close stream message
            2019-12-04T17:54:15.903546-08:00 INFO 62: (default) DCP (Consumer) eq_dcpq:replication:takeover:883bb157c17a8653edc697d8a806f4f0:ns_1@172.23.121.224->ns_1@172.23.121.226:default:334 - (vb:334) Attempting to add takeover stream: opaque_:1, start_seqno_:6, end_seqno_:18446744073709551615, vb_uuid:11804952313819
            0, snap_start_seqno_:6, snap_end_seqno_:6, last_seqno:6, stream_req_value:{"uid":"0"}
            2019-12-04T17:54:15.998054-08:00 INFO (default) VBucket::setState: transitioning vb:334 with high seqno:6 from:pending to:active
            

            Looking at the producer node's logs (.224) for the same time, we can see the producer's view of this; note the producer is running build 5.0.0-3519 at this time:

            2019-12-04T17:49:09.375865-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Creating stream with start seqno 0 and end seqno 18446744073709551615; requested end seqno was 18446744073709551615
            2019-12-04T17:49:09.375991-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Scheduling backfill from 1 to 6, reschedule flag : False
            2019-12-04T17:49:09.376405-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Sending disk snapshot with start seqno 0 and end seqno 6
            2019-12-04T17:49:09.376737-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Backfill complete, 0 items read from disk, 4 from memory, last seqno read: 6, pendingBackfill : False
            2019-12-04T17:49:09.376978-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - ActiveStream::transitionState: (vb 334) Transitioning from backfilling to in-memory
            

            To summarise: we crash when node .226 tries to send a Disk Checkpoint without an HCS. This Checkpoint originally came from node .224 when it was running CB 5.0.0; as such, no HCS would have been sent to .226.

            The fix is to ensure that Disk Checkpoints received from non-SyncReplication-capable Producers automatically set the HCS to zero (given that before SyncReplication nothing could be Committed).
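            A minimal sketch of that idea (illustrative only; the function name and signature are hypothetical, not the actual kv_engine patch):

```cpp
#include <cstdint>
#include <optional>

// Hypothetical sketch, not the actual kv_engine patch: when a Consumer
// receives a Disk snapshot marker from a producer that does not support
// SyncReplication, the marker carries no high completed seqno (HCS).
// Defaulting the HCS to 0 is safe, because before SyncReplication
// nothing could have been Committed.
std::optional<uint64_t> resolveHighCompletedSeqno(
        bool producerSupportsSyncRep,
        std::optional<uint64_t> hcsFromMarker) {
    if (hcsFromMarker) {
        return hcsFromMarker; // producer sent an explicit HCS
    }
    if (!producerSupportsSyncRep) {
        return uint64_t{0}; // legacy (e.g. 5.0.0) producer: assume HCS of 0
    }
    // A SyncRep-capable producer omitted the HCS on a Disk snapshot:
    // leave it empty so the caller can reject it.
    return std::nullopt;
}
```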

{<std::__shared_ptr_access<StreamContainer<std::shared_ptr<Stream> >, (__gnu_cxx::_Lock_policy)2, false , false >> = {<No data fields>}, _M_ptr = 0x7fca9c06c460, _M_refcount = { _M_pi = 0x7fca9c06c450}}, <No data fields>} iterations = 0 expected = 96 #13 0x00007fca98bb7964 in ExecutorThread::run (this=0x7fca9bf40a40) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/executorthread.cc:187 curTaskDescr = {static npos = 18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7fca92183e80 <Address 0x7fca92183e80 out of bounds>}, _M_string_length = 107, { _M_local_buf = "k\000\000\000\000\000\000\000۔\003\236\312\177\000" , _M_allocated_capacity = 107}} woketime = <optimized out> scheduleOverhead = <optimized out> again = <optimized out> runtime = <optimized out> q = <optimized out> tick = 64 '@' Here we can see that in frame #8, when preparing DCP items from CheckpointManager (in-memory phase) we are attempting to send the high_completed_seqno ( sendHCS = true ) but hcsToSend is not initialised (missing): #8 ActiveStream::snapshot (this=this@entry=0x7fca68345d10, checkpointType=<optimized out>, items=..., highCompletedSeqno=...) at /home/couchbase/jenkins/workspace/couchbase-server-unix/kv_engine/engines/ep/src/dcp/active_stream.cc:1184 flags = <optimized out> snapEnd = 6 hcsToSend = {<boost::optional_detail::tc_optional_base<unsigned long>> = {<boost::optional_detail::optional_tag> = {<No data fields>}, m_initialized = false , m_storage = 140508178662144}, <No data fields>} isCkptTypeDisk = <optimized out> snapStart = 0 sendHCS = true Where did this vbucket (and specifically this Checkpoint - up to seqno:6) come from? vb:334 was previously sent from node .224 to node .226. 
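The failure mode in frame #8 can be reduced to a small sketch (all names here are illustrative, not the actual kv_engine code): a Disk snapshot with sendHCS true requires the optional high completed seqno to be engaged; the GSL Expects() precondition at active_stream.cc:1184 throws gsl::fail_fast when it is not, which nothing on the DCP task thread catches, so the exception escalates to std::terminate() and the abort seen above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Stand-in for the GSL precondition check: a failed Expects() throws
// gsl::fail_fast, which is uncaught on the DCP task thread and therefore
// ends in std::terminate().
static void expects(bool cond, const char* msg) {
    if (!cond) throw std::logic_error(msg);
}

// Sketch of the snapshot decision: for a Disk checkpoint sent to a
// SyncReplication-enabled consumer we must send the HCS, so the optional
// must hold a value. Returns true if the precondition would have fired.
bool snapshotWouldCrash(bool isDiskCheckpoint,
                        bool syncReplEnabled,
                        std::optional<uint64_t> highCompletedSeqno) {
    const bool sendHCS = isDiskCheckpoint && syncReplEnabled;
    try {
        if (sendHCS) {
            expects(highCompletedSeqno.has_value(),
                    "Disk checkpoint must carry a high completed seqno");
        }
        return false; // snapshot marker would be sent normally
    } catch (const std::logic_error&) {
        return true; // precondition failure -> fail_fast -> terminate
    }
}
```

In the crash above the checkpoint is Disk, SyncReplication is negotiated between the two 6.5.0 nodes, but the optional (hcsToSend, m_initialized = false) was never set.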
Node .226 logs (which is running 6.5.0-4922 at this point in time):

2019-12-04T17:49:09.353854-08:00 INFO (default) VBucket: created vb:334 with state:replica initialState:dead lastSeqno:0 persistedRange:{0,0} max_cas:0 uuid:144175708859207 topology:null
2019-12-04T17:49:09.364325-08:00 INFO (default) VBucket::setState: transitioning vb:334 with high seqno:0 from:replica to:replica
2019-12-04T17:49:09.371449-08:00 INFO 73: (default) DCP (Consumer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb:334) Attempting to add stream: opaque_:18, start_seqno_:0, end_seqno_:18446744073709551615, vb_uuid:144175708859207, snap_start_seqno_:0, snap_end_seqno_:0, last_seqno: 0, stream_req_value:{ "uid" : "0" }
2019-12-04T17:54:15.377380-08:00 INFO (default) VBucket::setState: transitioning vb:334 with high seqno:6 from:replica to:pending
2019-12-04T17:54:15.759335-08:00 INFO 73: (default) DCP (Consumer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb:334) Setting stream to dead state, last_seqno is 6, unAckedBytes is 0, status is The stream closed due to a close stream message
2019-12-04T17:54:15.903546-08:00 INFO 62: (default) DCP (Consumer) eq_dcpq:replication:takeover:883bb157c17a8653edc697d8a806f4f0:ns_1@172.23.121.224->ns_1@172.23.121.226:default:334 - (vb:334) Attempting to add takeover stream: opaque_:1, start_seqno_:6, end_seqno_:18446744073709551615, vb_uuid:118049523138190, snap_start_seqno_:6, snap_end_seqno_:6, last_seqno:6, stream_req_value:{ "uid" : "0" }
2019-12-04T17:54:15.998054-08:00 INFO (default) VBucket::setState: transitioning vb:334 with high seqno:6 from:pending to:active

Looking at the producer node logs (.224) for the same time window we can see the producer's view of this - note the producer is running build 5.0.0-3519 at this time:

2019-12-04T17:49:09.375865-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Creating stream with start seqno 0 and end seqno 18446744073709551615; requested end seqno was 18446744073709551615
2019-12-04T17:49:09.375991-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Scheduling backfill from 1 to 6, reschedule flag : False
2019-12-04T17:49:09.376405-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Sending disk snapshot with start seqno 0 and end seqno 6
2019-12-04T17:49:09.376737-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - (vb 334) Backfill complete, 0 items read from disk, 4 from memory, last seqno read: 6, pendingBackfill : False
2019-12-04T17:49:09.376978-08:00 NOTICE 54: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.23.121.224->ns_1@172.23.121.226:default - ActiveStream::transitionState: (vb 334) Transitioning from backfilling to in-memory

To summarise: we crash when node .226 tries to send a Disk Checkpoint without an HCS. This Checkpoint originally came from node .224 while it was running Couchbase Server 5.0.0, so no HCS was ever sent to .226. The fix is to ensure that Disk Checkpoints received from non-SyncReplication-capable Producers automatically set the HCS to zero (before SyncReplication existed, nothing could have been Committed).
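The fix described above can be sketched in a few lines (the struct and function names here are illustrative, not the actual kv_engine API): on the consumer side, a Disk snapshot marker arriving from a producer that never negotiated SyncReplication gets its missing HCS defaulted to 0, since a pre-SyncReplication producer can never have Committed anything.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical view of a snapshot marker as seen by a consumer-side
// PassiveStream; 5.0.0 producers never include an HCS.
struct SnapshotMarker {
    bool isDiskSnapshot;
    std::optional<uint64_t> highCompletedSeqno;
};

// Sketch of the MB-37161 fix: if the producer is not SyncReplication
// capable, a Disk snapshot's absent HCS can safely default to 0, because
// nothing could have been Committed before SyncReplication existed.
std::optional<uint64_t> resolveHCS(const SnapshotMarker& marker,
                                   bool producerSupportsSyncRepl) {
    if (marker.isDiskSnapshot && !producerSupportsSyncRepl &&
        !marker.highCompletedSeqno) {
        return uint64_t{0};
    }
    return marker.highCompletedSeqno; // trust the producer otherwise
}
```

With this in place, the Disk Checkpoint replicated from the 5.0.0 node carries HCS 0, so the later ActiveStream::snapshot precondition holds when .226 streams it onward.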
            owend Daniel Owen made changes -
            Sprint KV-Engine Mad-Hatter GA [ 910 ]
            owend Daniel Owen made changes -
            Rank Ranked higher
            owend Daniel Owen made changes -
            Link This issue relates to MB-37188 [ MB-37188 ]
            owend Daniel Owen made changes -
            Due Date 09/Dec/19
            owend Daniel Owen made changes -
            Summary [Upgrade] failover upgrade failed during rebalance last upgrade node in cluser. [Upgrade] failover upgrade failed during rebalance last upgrade node in cluster.
            owend Daniel Owen made changes -
            Link This issue relates to MB-37187 [ MB-37187 ]
            owend Daniel Owen made changes -
            Link This issue relates to MB-36676 [ MB-36676 ]
            jwalker Jim Walker made changes -
            Assignee Dave Rigby [ drigby ] Thuan Nguyen [ thuan ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]


            build-team Couchbase Build Team added a comment - Build couchbase-server-6.5.0-4927 contains kv_engine commit 9029b44 with commit message: MB-37161 : Set HCS for Disk Snapshots received from non-MH Producers


            thuan Thuan Nguyen added a comment - I will verify it soon.


            build-team Couchbase Build Team added a comment - Build couchbase-server-6.5.1-6001 contains kv_engine commit 9029b44 with commit message: MB-37161 : Set HCS for Disk Snapshots received from non-MH Producers
            wayne Wayne Siu made changes -
            Link This issue blocks MB-37192 [ MB-37192 ]
            wayne Wayne Siu made changes -
            Link This issue blocks MB-37192 [ MB-37192 ]

            thuan Thuan Nguyen added a comment - Verified on build 6.5.0-4931. This issue is fixed.
            thuan Thuan Nguyen made changes -
            Sprint Status Current Sprint [ 10027 ]
            Status Resolved [ 5 ] Closed [ 6 ]
            drigby Dave Rigby made changes -
            Link This issue causes MB-37204 [ MB-37204 ]

            drigby Dave Rigby added a comment - Fix is incomplete - the initial fix incorrectly assumed that DCP features (including SyncReplication) are negotiated before PassiveStream objects are created.
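The ordering bug behind the reopen can be sketched as follows (class and method names are illustrative, not the actual kv_engine code): a PassiveStream may be constructed before the consumer's DCP Control negotiation finishes, so capturing the SyncReplication flag once at construction freezes a stale value; the follow-up fix queries the negotiated state at the time each snapshot marker is processed instead.

```cpp
#include <cassert>

// Hypothetical consumer whose DCP Control negotiation (including the
// SyncReplication feature) may complete after streams already exist.
class ConsumerState {
public:
    void negotiateSyncRepl() { syncReplEnabled = true; }
    bool isSyncReplEnabled() const { return syncReplEnabled; }
private:
    bool syncReplEnabled = false;
};

class PassiveStreamSketch {
public:
    explicit PassiveStreamSketch(const ConsumerState& c)
        : consumer(c),
          // Buggy pattern from the initial fix: value frozen at
          // construction, before negotiation has necessarily completed.
          supportAtConstruction(c.isSyncReplEnabled()) {}

    bool buggyCheck() const { return supportAtConstruction; }
    // Corrected pattern: ask the consumer when the check is needed,
    // i.e. after DCP Control negotiation is complete.
    bool fixedCheck() const { return consumer.isSyncReplEnabled(); }

private:
    const ConsumerState& consumer;
    const bool supportAtConstruction;
};
```

This matches the intent of the second change ("Check if SyncRepl enabled after DCP Control negotiation complete"): the stale snapshot of the flag disagrees with the live state whenever the stream predates negotiation.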
            drigby Dave Rigby made changes -
            Assignee Thuan Nguyen [ thuan ] Dave Rigby [ drigby ]
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Reopened [ 4 ]
            owend Daniel Owen made changes -
            Sprint KV-Engine Mad-Hatter GA [ 910 ] KV-Engine Mad-Hatter GA, KV Sprint 2019-12 [ 910, 939 ]
            owend Daniel Owen made changes -
            Rank Ranked higher
            owend Daniel Owen made changes -
            Rank Ranked higher
            drigby Dave Rigby made changes -
            Link This issue is duplicated by MB-37204 [ MB-37204 ]
            drigby Dave Rigby made changes -
            Link This issue is duplicated by MB-37204 [ MB-37204 ]
            lynn.straus Lynn Straus made changes -
            Due Date 09/Dec/19 11/Dec/19
            arunkumar Arunkumar Senthilnathan (Inactive) made changes -
            Remote Link This issue links to "Page (Couchbase, Inc. Wiki)" [ 18967 ]
            drigby Dave Rigby made changes -
            Assignee Dave Rigby [ drigby ] Thuan Nguyen [ thuan ]
            Resolution Fixed [ 1 ]
            Status Reopened [ 4 ] Resolved [ 5 ]


            build-team Couchbase Build Team added a comment - Build couchbase-server-6.5.0-4942 contains kv_engine commit cbdd619 with commit message: MB-37161 : Check if SyncRepl enabled after DCP Control negoatiation complete


            build-team Couchbase Build Team added a comment - Build couchbase-server-7.0.0-1123 contains kv_engine commit 9029b44 with commit message: MB-37161 : Set HCS for Disk Snapshots received from non-MH Producers


            build-team Couchbase Build Team added a comment - Build couchbase-server-6.5.1-6006 contains kv_engine commit cbdd619 with commit message: MB-37161 : Check if SyncRepl enabled after DCP Control negoatiation complete


            build-team Couchbase Build Team added a comment - Build couchbase-server-7.0.0-1133 contains kv_engine commit cbdd619 with commit message: MB-37161 : Check if SyncRepl enabled after DCP Control negoatiation complete


            build-team Couchbase Build Team added a comment - Build couchbase-server-6.5.1-6015 contains kv_engine commit 39d4604 with commit message: MB-37187 : Add cluster_test test for MB-37161
            ritam.sharma Ritam Sharma made changes -
            VERIFICATION STEPS Closing on 6.5.0-4947. All upgrade tests have passed - 2i + fts + regular upgrade regression.
            Status Resolved [ 5 ] Closed [ 6 ]


            build-team Couchbase Build Team added a comment - Build couchbase-server-7.0.0-1144 contains kv_engine commit 39d4604 with commit message: MB-37187 : Add cluster_test test for MB-37161


            build-team Couchbase Build Team added a comment - Build couchbase-server-6.6.0-7519 contains kv_engine commit 39d4604 with commit message: MB-37187 : Add cluster_test test for MB-37161
