Couchbase Server / MB-45895

[Windows] Online upgrade with swap rebalance failed with reason mover_crashed, wait_seqno_persisted_failed


Details

    Description

      Couchbase Server 7.0.0-5016 Enterprise Edition

      After the failure in MB-45893, the new 7.0 node that had been added to the cluster (172.23.107.129) was removed. That rebalance operation went fine.

      The next step was to perform a swap rebalance to take out the 6.6.2 node 172.23.107.126 and bring in 172.23.107.129. This swap rebalance operation failed after ~10 minutes.

      Rebalance exited with reason {mover_crashed,
      {unexpected_exit,
      {'EXIT',<0.27190.4>,
      {{wait_seqno_persisted_failed,"default",469,
      1553,
      [{'ns_1@172.23.107.142',
      {'EXIT',
      {{{badmatch,{error,timeout}},
      [{mc_client_binary,cmd_vocal_recv,5,
      [{file,"src/mc_client_binary.erl"},
      {line,152}]},
      {mc_client_binary,select_bucket,2,
      [{file,"src/mc_client_binary.erl"},
      {line,344}]},
      {ns_memcached_sockets_pool,
      take_socket,2,
      [{file,
      "src/ns_memcached_sockets_pool.erl"},
      {line,40}]},
      {ns_memcached_sockets_pool,
      '-executing_on_socket/3-fun-0-',3,
      [{file,
      "src/ns_memcached_sockets_pool.erl"},
      {line,82}]},
      {async,'-async_init/4-fun-1-',3,
      [{file,"src/async.erl"},
      {line,191}]}]},
      {gen_server,call,
      [{'janitor_agent-default',
      'ns_1@172.23.107.142'},
      {if_rebalance,<0.12682.4>,
      {wait_seqno_persisted,469,1553}},
      infinity]}}}}]},
      [{ns_single_vbucket_mover,
      '-wait_seqno_persisted_many/5-fun-2-',5,
      [{file,"src/ns_single_vbucket_mover.erl"},
      {line,482}]},
      {proc_lib,init_p,3,
      [{file,"proc_lib.erl"},{line,234}]}]}}}}.
      Rebalance Operation Id = 81f43662ad33d73c76a515c82d352a89
      

      Nodes in the cluster : 172.23.107.142 (7.0 node), 172.23.107.126 (outgoing 6.6.2 node), 172.23.107.129 (incoming 7.0 node).

      The initial steps to set up the cluster are the same as in MB-45893.
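For reference, the swap rebalance described above boils down to a single call against ns_server's REST API. A minimal sketch of the request payload follows; the `/controller/rebalance` endpoint and its `knownNodes`/`ejectedNodes` form fields are from the public cluster REST API, while the helper name and the commented-out POST are illustrative assumptions, not a tool used in this test.

```python
def build_swap_rebalance_payload(keep_nodes, eject_nodes):
    """Build the form payload for POST /controller/rebalance.

    knownNodes must list every node currently in the cluster,
    including the ones being ejected; ejectedNodes lists only
    the outgoing nodes.
    """
    return {
        "knownNodes": ",".join(keep_nodes + eject_nodes),
        "ejectedNodes": ",".join(eject_nodes),
    }

# The failing swap rebalance: keep .142 and the incoming .129,
# eject the outgoing 6.6.2 node .126.
payload = build_swap_rebalance_payload(
    keep_nodes=["ns_1@172.23.107.142", "ns_1@172.23.107.129"],
    eject_nodes=["ns_1@172.23.107.126"],
)

# A real run would then POST it with cluster credentials, e.g.:
#   requests.post("http://172.23.107.142:8091/controller/rebalance",
#                 auth=("Administrator", "password"), data=payload)
```

Listing the ejected node in `knownNodes` as well is deliberate: the cluster rejects a rebalance request whose known-node list does not match its current membership.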

      Attachments

        Issue Links


          Activity

            owend Daniel Owen added a comment -

            This looks like it might be KV, so I will take an initial triage.

            owend Daniel Owen added a comment - - edited

            On node .142, in ns_server.debug.log, we see 2 rebalance failures.
            Rebalance 1

            [user:info,2021-04-23T15:56:02.411-07:00,ns_1@172.23.107.142:<0.15047.8>:ns_orchestrator:idle:773]Starting rebalance, KeepNodes = ['ns_1@172.23.107.126','ns_1@172.23.107.129',
                                             'ns_1@172.23.107.142'], EjectNodes = [], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = fda046d63e6f8a8b8014af2288bb470a
            

            Failed here:

            [ns_server:debug,2021-04-23T15:56:13.288-07:00,ns_1@172.23.107.142:leader_activities<0.14526.8>:leader_activities:handle_activity_down:505]Activity terminated with reason {shutdown,
                                             {async_died,
                                              {raised,
                                               {exit,
                                                {pre_rebalance_janitor_run_failed,
                                                 "travel-sample",
                                                 {error,wait_for_memcached_failed,
                                                  ['ns_1@172.23.107.142']}},
                                                [{ns_rebalancer,
                                                  run_janitor_pre_rebalance,1,
                                                  [{file,"src/ns_rebalancer.erl"},
                                                   {line,678}]},
                                                 {lists,foreach,2,
                                                  [{file,"lists.erl"},{line,1338}]},
                                                 {ns_rebalancer,rebalance_body,5,
                                                  [{file,"src/ns_rebalancer.erl"},
                                                   {line,513}]},
                                                 {async,'-async_init/4-fun-1-',3,
                                                  [{file,"src/async.erl"},
                                                   {line,191}]}]}}}}. Activity:
            {activity,<0.16763.34>,#Ref<0.633299211.4211081220.149137>,default,
                      <<"1501ccf09e6318e59de2890e4bf3954c">>,
                      [rebalance],
                      majority,[]}
            

            Rebalance 2

            [user:info,2021-04-23T16:12:48.988-07:00,ns_1@172.23.107.142:<0.11676.1>:ns_orchestrator:idle:773]Starting rebalance, KeepNodes = ['ns_1@172.23.107.129','ns_1@172.23.107.142'], EjectNodes = ['ns_1@172.23.107.126'], Failed over and being ejected nodes = []; no delta recovery nodes; Operation Id = 81f43662ad33d73c76a515c82d352a89
            

            Failed here:

            ** Reason for termination ==
            ** {unexpected_exit,
                   {'EXIT',<0.27190.4>,
                       {{wait_seqno_persisted_failed,"default",469,1553,
                            [{'ns_1@172.23.107.142',
                                 {'EXIT',
                                     {{{badmatch,{error,timeout}},
                                       [{mc_client_binary,cmd_vocal_recv,5,
                                            [{file,"src/mc_client_binary.erl"},
                                             {line,152}]},
                                        {mc_client_binary,select_bucket,2,
                                            [{file,"src/mc_client_binary.erl"},
                                             {line,344}]},
                                        {ns_memcached_sockets_pool,take_socket,2,
                                            [{file,"src/ns_memcached_sockets_pool.erl"},
                                             {line,40}]},
                                        {ns_memcached_sockets_pool,
                                            '-executing_on_socket/3-fun-0-',3,
                                            [{file,"src/ns_memcached_sockets_pool.erl"},
                                             {line,82}]},
                                        {async,'-async_init/4-fun-1-',3,
                                            [{file,"src/async.erl"},{line,191}]}]},
                                      {gen_server,call,
                                          [{'janitor_agent-default','ns_1@172.23.107.142'},
                                           {if_rebalance,<0.12682.4>,
                                               {wait_seqno_persisted,469,1553}},
                                           infinity]}}}}]},
                        [{ns_single_vbucket_mover,'-wait_seqno_persisted_many/5-fun-2-',
                             5,
                             [{file,"src/ns_single_vbucket_mover.erl"},{line,482}]},
                         {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}}}
            

            owend Daniel Owen added a comment - - edited

            Analysis of Rebalance Failure 1
            The rebalance lasted only from 2021-04-23T15:56:02 to 2021-04-23T15:56:13, i.e. 11 seconds.
            During this time we don't see anything of interest in memcached.log.
            Not sure why we see {error,wait_for_memcached_failed}. Dave Finlay, maybe you can help highlight why?
            Looks like a dup of the rebalance failure seen in MB-45893.
            We see a gap in stats at this time, suggesting the system is struggling (but in this case not due to XDCR).

            Analysis of Rebalance Failure 2
            The rebalance started at 2021-04-23T16:12:48 and terminated at 2021-04-23T16:23:32, so it was running for ~10 minutes.
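As a quick sanity check on that runtime, the two log timestamps above can be subtracted directly (a throwaway sketch; sub-second parts and the timezone offset are dropped since both stamps are local time):

```python
from datetime import datetime

# Start and end of rebalance 2, taken from the ns_server log lines.
start = datetime.fromisoformat("2021-04-23T16:12:48")
end = datetime.fromisoformat("2021-04-23T16:23:32")

runtime = end - start
minutes = runtime.total_seconds() / 60  # 644 s, i.e. ~10.7 minutes
```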

            [error_logger:error,2021-04-23T16:23:32.697-07:00,ns_1@172.23.107.142:<0.27190.4>:ale_error_logger_handler:do_log:101]
            =========================CRASH REPORT=========================
              crasher:
                initial call: ns_single_vbucket_mover:'-wait_seqno_persisted_many/5-fun-2-'/0
                pid: <0.27190.4>
                registered_name: []
                exception error: {wait_seqno_persisted_failed,"default",469,1553,
            

            Interestingly, in memcached.log we see:

            2021-04-23T16:21:30.215813-07:00 WARNING 1776: Slow operation. {"cid":"172.23.107.142:61496/0","duration":"10 s","trace":"request=254819401286496:10205175 execute=254819401286496:22 execute=254829606410512:51","command":"SASL_AUTH","peer":"{"ip":"172.23.107.142","port":61496}","bucket":"","packet":{"bodylen":45,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>PLAIN</ud>","keylen":5,"magic":"ClientRequest","opaque":0,"opcode":"SASL_AUTH","vbucket":0}}
             
            < 48 second gap>
             
            2021-04-23T16:22:18.890585-07:00 WARNING 3136: Slow operation. {"cid":"172.23.107.126:64195/e03de","duration":"48 s","trace":"request=254829827705696:48442700 execute=254829827705696:48442700","command":"STAT","peer":"{"ip":"172.23.107.126","port":64195}","bucket":"beer-sample","packet":{"bodylen":15,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>vbucket-details</ud>","keylen":15,"magic":"ClientRequest","opaque":3724742144,"opcode":"STAT","vbucket":0}}
             
            <53 second gap>
             
            2021-04-23T16:23:11.839941-07:00 WARNING 1336: Slow operation. {"cid":"172.23.107.142:57387/e03de","duration":"53 s","trace":"request=254878270969408:52954717 execute=254878270969408:52954717","command":"STAT","peer":"{"ip":"172.23.107.142","port":57387}","bucket":"beer-sample","packet":{"bodylen":15,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>vbucket-details</ud>","keylen":15,"magic":"ClientRequest","opaque":3724742144,"opcode":"STAT","vbucket":0}}
             
            2021-04-23T16:23:11.840663-07:00 WARNING 1656: Slow operation. {"cid":"172.23.107.142:61683/0","duration":"100 s","trace":"request=254831496057008:99730372 execute=254831496057008:22 execute=254931226331488:97","command":"SASL_AUTH","peer":"{"ip":"172.23.107.142","port":61683}","bucket":"","packet":{"bodylen":45,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>PLAIN</ud>","keylen":5,"magic":"ClientRequest","opaque":0,"opcode":"SASL_AUTH","vbucket":0}}
            2021-04-23T16:23:11.841113-07:00 WARNING 3312: Slow operation. {"cid":"172.23.107.142:61691/0","duration":"99 s","trace":"request=254831889792304:99337104 execute=254831889792304:26 execute=254931226821840:74","command":"SASL_AUTH","peer":"{"ip":"172.23.107.142","port":61691}","bucket":"","packet":{"bodylen":45,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>PLAIN</ud>","keylen":5,"magic":"ClientRequest","opaque":0,"opcode":"SASL_AUTH","vbucket":0}}
            2021-04-23T16:23:11.841128-07:00 WARNING 4788: Slow operation. {"cid":"172.23.107.142:61701/0","duration":"99 s","trace":"request=254832016855472:99210076 execute=254832016855472:27 execute=254931226896288:35","command":"SASL_AUTH","peer":"{"ip":"172.23.107.142","port":61701}","bucket":"","packet":{"bodylen":46,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>PLAIN</ud>","keylen":5,"magic":"ClientRequest","opaque":0,"opcode":"SASL_AUTH","vbucket":0}}
            2021-04-23T16:23:11.842429-07:00 WARNING 4784: Slow operation. {"cid":"172.23.107.142:61703/0","duration":"99 s","trace":"request=254832014190400:99214031 execute=254832014190400:18 execute=254931228186592:34","command":"SASL_AUTH","peer":"{"ip":"172.23.107.142","port":61703}","bucket":"","packet":{"bodylen":45,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>PLAIN</ud>","keylen":5,"magic":"ClientRequest","opaque":0,"opcode":"SASL_AUTH","vbucket":0}}
            2021-04-23T16:23:11.842435-07:00 WARNING 2968: Slow operation. {"cid":"172.23.107.142:61702/0","duration":"99 s","trace":"request=254832014883872:99213326 execute=254832014883872:13 execute=254931228157040:52","command":"SASL_AUTH","peer":"{"ip":"172.23.107.142","port":61702}","bucket":"","packet":{"bodylen":45,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>PLAIN</ud>","keylen":5,"magic":"ClientRequest","opaque":0,"opcode":"SASL_AUTH","vbucket":0}}
            2021-04-23T16:23:11.842593-07:00 WARNING 2960: Slow operation. {"cid":"172.23.107.142:61688/0","duration":"99 s","trace":"request=254831872722224:99355669 execute=254831872722224:78 execute=254931228357376:34","command":"SASL_AUTH","peer":"{"ip":"172.23.107.142","port":61688}","bucket":"","packet":{"bodylen":45,"cas":0,"datatype":"raw","extlen":0,"key":"<ud>PLAIN</ud>","keylen":5,"magic":"ClientRequest","opaque":0,"opcode":"SASL_AUTH","vbucket":0}}
            

            Notice that...

            • first slow stat duration is 48 seconds.
            • second slow stat duration is 53 seconds

            48 + 53 = 101.
            And the subsequent slow auth commands have a duration of 100 s / 99 s. It looks like they were blocked behind the vbucket-details stat calls.
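The durations backing that arithmetic can be pulled straight out of the log lines. Note the slow-operation payloads above are not valid JSON as logged (the "peer" field contains unescaped quotes), so a small regex sketch like the following, written for this analysis rather than anything shipped with Couchbase, is the simplest approach:

```python
import re

# Matches the duration (whole seconds) and command name in a
# memcached "Slow operation" log line.
SLOW_OP = re.compile(r'"duration":"(\d+) s".*?"command":"([A-Z_]+)"')

def slow_op_durations(lines):
    """Return (command, seconds) for each slow-operation log line."""
    out = []
    for line in lines:
        m = SLOW_OP.search(line)
        if m:
            out.append((m.group(2), int(m.group(1))))
    return out

# Trimmed-down versions of the two slow STAT lines from the log.
sample = [
    'WARNING 3136: Slow operation. {"cid":"172.23.107.126:64195/e03de","duration":"48 s","trace":"...","command":"STAT","key":"vbucket-details"}',
    'WARNING 1336: Slow operation. {"cid":"172.23.107.142:57387/e03de","duration":"53 s","trace":"...","command":"STAT","key":"vbucket-details"}',
]

stats = slow_op_durations(sample)
total = sum(secs for _, secs in stats)  # 48 + 53 = 101, matching the
                                        # ~100 s blocked SASL_AUTH ops
```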

            We can see that around this time CPU utilisation is at 100%, and of the 6 available CPUs, 4 are being used by ns_server.
            Not sure why; again I think I need your input on this, Dave Finlay.

            As a side note, I see we are very close to the maximum of system memory.


            Aliaksey Artamonau Aliaksey Artamonau (Inactive) added a comment - - edited

            As Daniel pointed out, I think the memory usage is the problem. I see goxdcr use up to 4 GiB and cbft use up to 2.5 GiB on 172.23.107.142, which has only 6 GiB of RAM.
            neil.huang Neil Huang added a comment -

            XDCR recently had a memory leak issue resolved as MB-45935.
            7.0.0-5051 should contain this change (https://issues.couchbase.com/browse/MB-45935?focusedCommentId=495915&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-495915)

            Mihir Kamdar do you mind seeing if this works with that build or later?

            neil.huang Neil Huang added a comment - - edited

            The XDCR logs show the same error that was seen in MB-45893 and MB-45892:

            2021-04-23T16:34:47.302-07:00 ERRO GOXDCR.PipelineMgr: Failed to start the pipeline 721e2b5f8f84ad3127296726a3122b7d/beer-sample/beer-sample-169757800
            2021-04-23T16:34:47.303-07:00 ERRO GOXDCR.PipelineMgr: Failed to update pipeline 721e2b5f8f84ad3127296726a3122b7d/beer-sample/beer-sample, err=dcp_721e2b5f8f84ad3127296726a3122b7d/beer-sample/beer-sample_172.23.107.142:11210_0 : Not supported
            2021-04-23T16:34:47.303-07:00 ERRO GOXDCR.PipelineMgr: Update of pipeline 721e2b5f8f84ad3127296726a3122b7d/beer-sample/beer-sample failed with errors=dcp_721e2b5f8f84ad3127296726a3122b7d/beer-sample/beer-sample_172.23.107.142:11210_0 : Not supported
            

            Combining MB-45892, which causes repeated restarts, with MB-45935, which bloats memory on each restart, produced the XDCR memory bloat that ate up all the memory.

            At the very least, 7.0.0-5038 will unblock this issue. (https://issues.couchbase.com/browse/MB-45893?focusedCommentId=495809&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-495809)

            pavithra.mahamani Pavithra Mahamani added a comment - - edited

            Closing as a duplicate.


            People

              pavithra.mahamani Pavithra Mahamani
              mihir.kamdar Mihir Kamdar (Inactive)