  1. Couchbase Server
  2. MB-37167

Rebalance of multiple kv nodes failed with "Some apps are deploying or resuming on some or all Eventing nodes"


Details

    • Untriaged
    • Yes

    Description

      build: 6.5.0-4917, last passed on 6.5.0-4908

      • Create a 5-node cluster: 3 kv, 1 eventing, 1 index and n1ql
      • Deploy SBM handler with timer
      • Start data loading in source bucket
      • Add 2 more kv nodes – failed
      • Verify results in source bucket
      • Remove above added nodes from the cluster

      ./testrunner -i /tmp/testexec.17696.ini -p get-cbcollect-info=True,GROUP=source_bucket_mutation_timers -t eventing.eventing_rebalance.EventingRebalance.test_eventing_rebalance_with_multiple_kv_nodes,doc-per-day=5,dataset=default,nodes_init=5,services_init=kv-kv-kv-eventing-index:n1ql,groups=simple,reset_services=True,handler_code=source_bucket_mutation_with_timers,source_bucket_mutation=True,GROUP=source_bucket_mutation_timers 

      2019-12-04 21:39:56,138] - [rest_client:3331] ERROR - {u'node': u'ns_1@172.23.104.238', u'code': 0, u'text': u'Rebalance exited with reason {service_rebalance_failed,eventing,\n                              {worker_died,\n                               {\'EXIT\',<0.29602.2>,\n                                {{badmatch,\n                                  {error,\n                                   {unknown_error,\n                                    <<"Some apps are deploying or resuming on some or all Eventing nodes">>}}},\n                                 [{service_rebalancer,rebalance_worker,1,\n                                   [{file,"src/service_rebalancer.erl"},\n                                    {line,170}]},\n                                  {proc_lib,init_p,3,\n                                   [{file,"proc_lib.erl"},{line,232}]}]}}}}.\nRebalance Operation Id = 6868fa7d4f59e481b6b7553fb2b8c450', u'shortText': u'message', u'serverTime': u'2019-12-04T21:39:48.893Z', u'module': u'ns_orchestrator', u'tstamp': 1575524388893, u'type': u'critical'}
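
For triage, the record above can be reduced to the failing service and the message it returned. The following is a minimal Python sketch, not part of testrunner: the helper name `summarize_rebalance_failure` and the constant `SAMPLE_LINE` are illustrative assumptions, and the Erlang stack list in the sample is shortened to `[]`.

```python
import ast
import re

# Abbreviated copy of the log line above. The Erlang stack list is
# shortened to [] and some dict fields are dropped for readability.
SAMPLE_LINE = (
    r"2019-12-04 21:39:56,138] - [rest_client:3331] ERROR - "
    r"{u'node': u'ns_1@172.23.104.238', u'code': 0, "
    r"u'text': u'Rebalance exited with reason "
    r"{service_rebalance_failed,eventing,\n {worker_died,\n "
    r"{\'EXIT\',<0.29602.2>,\n {{badmatch,\n {error,\n {unknown_error,\n "
    r'<<"Some apps are deploying or resuming on some or all Eventing nodes">>'
    r"}}},\n []}}}}.\n"
    r"Rebalance Operation Id = 6868fa7d4f59e481b6b7553fb2b8c450', "
    r"u'module': u'ns_orchestrator', u'type': u'critical'}"
)


def summarize_rebalance_failure(log_line):
    """Extract node, service, message and operation id from one log line."""
    # The payload is a Python dict repr; everything before the first
    # brace is the logger prefix.
    payload = log_line[log_line.index("{"):]
    record = ast.literal_eval(payload)
    text = record["text"]

    # The service name follows the service_rebalance_failed atom.
    service = re.search(r"\{service_rebalance_failed,(\w+),", text)
    # The service's own message is printed as an Erlang binary: <<"...">>.
    message = re.search(r'<<"([^"]*)">>', text)
    op_id = re.search(r"Rebalance Operation Id = (\w+)", text)

    return {
        "node": record["node"],
        "service": service.group(1) if service else None,
        "message": message.group(1) if message else None,
        "operation_id": op_id.group(1) if op_id else None,
    }


if __name__ == "__main__":
    for key, value in summarize_rebalance_failure(SAMPLE_LINE).items():
        print(key, "=", value)
```

On the sample line this yields the node `ns_1@172.23.104.238`, the service `eventing`, and the quoted "Some apps are deploying or resuming" message, which is the part worth searching for in the Eventing logs.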
       
       


          Activity

            vikas.chaudhary Vikas Chaudhary created issue -
            jeelan.poola Jeelan Poola made changes -
            Field Original Value New Value
            Assignee Jeelan Poola [ jeelan.poola ] Suraj Naik [ suraj.naik ]

            jeelan.poola Jeelan Poola added a comment -

            Vikas Chaudhary, is this consistently reproducible?
            jeelan.poola Jeelan Poola made changes -
            Assignee Suraj Naik [ suraj.naik ] Satya Nand [ satya.nand ]
            jeelan.poola Jeelan Poola added a comment - - edited

            This just came in today. It is not consistently reproducible. We ran the same test on a dev cluster a couple of times, and the eventing-consumer is crashing. Root-cause analysis is still in progress. This is a must-fix for Mad-Hatter.

            suraj.naik Suraj Naik (Inactive) added a comment - - edited

            Vikas Chaudhary Jeelan Poola

            The logs show that the rebalance failed because one of the workers crashed. The crash was caused by a segmentation fault in libcouchbase. The observed stack trace is:

            #0  __GI___libc_free (mem=0x7f046c44ee80) at malloc.c:2941
            #1  0x00007f048c0b1bbb in lcbvb_destroy (conf=0x7f046c052b10)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/vbucket/vbucket.c:853
            #2  0x00007f048c0e6d64 in decref (this=0x7f046c050ae0)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/bucketconfig/clconfig.h:546
            #3  update (data=0x7f046c022db0 <Address 0x7f046c022db0 out of bounds>, host=<optimized out>, this=0x7f04840f6e10)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/bucketconfig/bc_cccp.cc:207
            #4  lcb::clconfig::cccp_update (provider=provider@entry=0x7f04840f6e10, host=<optimized out>, 
                data=0x7f046c022db0 <Address 0x7f046c022db0 out of bounds>)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/bucketconfig/bc_cccp.cc:175
            #5  0x00007f048c127136 in lcb::Server::handle_nmv (this=this@entry=0x7f0484111550, resinfo=..., oldpkt=oldpkt@entry=0x7f048412cbd0)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/mcserver/mcserver.cc:151
            #6  0x00007f048c129d49 in try_read (ior=0x7f048411ac28, ctx=0x7f048411abe0, this=0x7f0484111550)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/mcserver/mcserver.cc:396
            #7  on_read (ctx=0x7f048411abe0) at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/mcserver/mcserver.cc:460
            #8  0x00007f048c0c3d4c in invoke_read_cb (nb=10969, ctx=0x7f048411abe0)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/lcbio/ctx.c:278
            #9  E_handler (sock=<optimized out>, which=<optimized out>, arg=0x7f048411abe0)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/lcbio/ctx.c:307
            #10 0x00007f048c0a9852 in run_loop (io=<optimized out>, is_tick=<optimized out>)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/plugins/io/select/plugin-select.c:323
            #11 0x00007f048c13847e in lcb_wait (instance=0x7f04840f64c0) at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/wait.cc:103
            #12 0x0000000000463596 in RetryWithFixedBackoff<bool (&)(lcb_error_t), lcb_error_t (&)(lcb_st*), lcb_st*&, lcb_error_t, 0> (
                callable=@0x40a460: {lcb_error_t (lcb_st *)} 0x40a460 <lcb_wait@plt>, isRetriable=<optimized out>, initial_delay_milliseconds=200, 
                max_retry_count=5)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/features/include/retry_util.h:40
            #13 0x0000000000464b01 in timer::TimerStore::GetCounter (this=this@entry=0x7f04840cf240, key=...)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing-ee/features/src/timer_store.cc:319
            #14 0x00000000004664d0 in timer::TimerStore::SetTimer (this=0x7f04840cf240, timer=...)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing-ee/features/src/timer_store.cc:47
            #15 0x000000000041f846 in V8Worker::SetTimer (this=this@entry=0x7f0484013700, tinfo=...)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/v8_consumer/src/v8worker.cc:1148
            #16 0x000000000043dbdf in Timer::CreateTimerImpl (this=0x7f04840c22b0, args=...)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/v8_consumer/src/timer.cc:98
            #17 0x000000000043e286 in CreateTimer (args=...)
                at /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/v8_consumer/src/timer.cc:142
            #18 0x00007f048d88c239 in v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo*) () from /opt/couchbase/lib/libv8.so
            #19 0x00007f048d88b738 in v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) () from /opt/couchbase/lib/libv8.so
            #20 0x00007f048d88aec6 in v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) ()
               from /opt/couchbase/lib/libv8.so
            #21 0x00007f048e0c48ae in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit () from /opt/couchbase/lib/libv8.so
            #22 0x00002618c730816e in ?? ()
            #23 0x0000028282c825a1 in ?? ()
            #24 0x00002617c3a9d139 in ?? ()
            #25 0x0000000900000000 in ?? ()
            #26 0x0000028282c82681 in ?? ()
            #27 0x0000065d78b0b0e9 in ?? ()
            #28 0x0000364842a7fcd9 in ?? ()
            #29 0x0000065d78b0af49 in ?? ()
            #30 0x0000364842a094a9 in ?? ()
            #31 0x00001f7c96f04679 in ?? ()
            #32 0x0000065d78b0b0e9 in ?? ()
            #33 0x0000364842a7fcd9 in ?? ()
            #34 0x0000065d78b0af49 in ?? ()
            #35 0x0000364842a094a9 in ?? ()
            #36 0x00002617c3a9d139 in ?? ()
            #37 0x0000065d78b0b0e9 in ?? ()
            #38 0x0000065d78b0b0c9 in ?? ()
            #39 0x0000065d78b0b099 in ?? ()
            #40 0x0000065d78b0af49 in ?? ()
            #41 0x000000a300000000 in ?? ()
            #42 0x0000364842a09e91 in ?? ()
            #43 0x0000364842a09441 in ?? ()
            #44 0x00002617c3a82ad9 in ?? ()
            #45 0x00007f0481e324c8 in ?? ()
            #46 0x00007f048e034603 in Builtins_JSEntryTrampoline () from /opt/couchbase/lib/libv8.so
            #47 0x0000065d78b0aed9 in ?? ()
            #48 0x0000065d78b0ac81 in ?? ()
            #49 0x00001f7c96f04679 in ?? ()
            #50 0x0000364842a09441 in ?? ()
            #51 0x0000000000000020 in ?? ()
            #52 0x00007f0481e32530 in ?? ()
            #53 0x00002618c73040de in ?? ()
            #54 0x0000000000000000 in ?? ()
            

            This stack trace is similar to the one in MB-36839, an earlier bug filed against the same test. An SDK-side bug, CCBC-1118, was opened to fix that issue. The test ran fine after the fix, but we could not fully verify it because the issue was very intermittent. The fix appears to be incomplete, since the crash was reproduced during one of the test runs.
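
To see at a glance which component owns the innermost frames, a backtrace like the one above can be bucketed by source path. This is a minimal Python sketch: the function names and the path-based rules are assumptions chosen for this particular trace, and the sample keeps only five of the frames.

```python
import re

# Five frames copied from the backtrace above; the rest are omitted.
SAMPLE_BACKTRACE = """\
#0  __GI___libc_free (mem=0x7f046c44ee80) at malloc.c:2941
#1  0x00007f048c0b1bbb in lcbvb_destroy (conf=0x7f046c052b10)
    at /home/couchbase/jenkins/workspace/couchbase-server-unix/libcouchbase/src/vbucket/vbucket.c:853
#13 0x0000000000464b01 in timer::TimerStore::GetCounter (this=this@entry=0x7f04840cf240, key=...)
    at /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing-ee/features/src/timer_store.cc:319
#18 0x00007f048d88c239 in v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo*) () from /opt/couchbase/lib/libv8.so
#22 0x00002618c730816e in ?? ()
"""

FRAME_START = re.compile(r"^#(\d+)\s")


def split_frames(backtrace):
    """Return (frame_number, text) pairs, joining wrapped 'at ...' lines."""
    frames = []
    for line in backtrace.splitlines():
        stripped = line.strip()
        match = FRAME_START.match(stripped)
        if match:
            frames.append([int(match.group(1)), stripped])
        elif frames and stripped:
            # Continuation line: belongs to the most recent frame.
            frames[-1][1] += " " + stripped
    return [(number, text) for number, text in frames]


def classify(frame_text):
    """Assign a frame to a component using path and symbol heuristics."""
    if "/libcouchbase/" in frame_text:
        return "libcouchbase"
    if "/couchbase/eventing" in frame_text:  # eventing and eventing-ee
        return "eventing"
    if "libv8.so" in frame_text:
        return "v8"
    if "__libc_" in frame_text:
        return "libc"
    return "unresolved"


def crash_component(backtrace):
    """Return (frame_number, component) for the innermost non-libc frame."""
    for number, text in split_frames(backtrace):
        component = classify(text)
        if component not in ("libc", "unresolved"):
            return number, component
    return None


if __name__ == "__main__":
    for number, text in split_frames(SAMPLE_BACKTRACE):
        print(number, classify(text))
    print("crash attributed to:", crash_component(SAMPLE_BACKTRACE))
```

For the sample frames, the innermost frame outside libc is frame 1 (`lcbvb_destroy`), which the heuristic assigns to libcouchbase.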


            vikas.chaudhary Vikas Chaudhary added a comment -

            Suraj Naik, please reopen CCBC-1118 if it's the root cause.
            suraj.naik Suraj Naik (Inactive) made changes -
            Link This issue depends on CCBC-1118 [ CCBC-1118 ]
            jeelan.poola Jeelan Poola made changes -
            Component/s clients [ 10042 ]
            Component/s eventing [ 14026 ]
            lynn.straus Lynn Straus made changes -
            Labels functional-test approved-for-mad-hatter functional-test
            lynn.straus Lynn Straus added a comment -

            Linking CCBC-1130.  Per updates in CCBC-1118, this is reporting a new bug which is being tracked by CCBC-1130.

            lynn.straus Lynn Straus made changes -
            Link This issue depends on CCBC-1130 [ CCBC-1130 ]
            vikas.chaudhary Vikas Chaudhary made changes -
            Description build: 6.5.0-4917 , last passed on 6.5.0-4908
             * Create 5 node cluster 3 kv , 1 eventing , 1 index and n1ql
             * Deploy SBM handler with timer
             * Start data loading in source bucket
             * {color:#de350b}Add 2 more kv nodes – failed{color}
             * Verify results in source bucket
             * Remove above added nodes from the cluster – failed

            {noformat}./testrunner -i /tmp/testexec.17696.ini -p get-cbcollect-info=True,GROUP=source_bucket_mutation_timers -t eventing.eventing_rebalance.EventingRebalance.test_eventing_rebalance_with_multiple_kv_nodes,doc-per-day=5,dataset=default,nodes_init=5,services_init=kv-kv-kv-eventing-index:n1ql,groups=simple,reset_services=True,handler_code=source_bucket_mutation_with_timers,source_bucket_mutation=True,GROUP=source_bucket_mutation_timers {noformat}
            {noformat}2019-12-04 21:39:56,138] - [rest_client:3331] ERROR - {u'node': u'ns_1@172.23.104.238', u'code': 0, u'text': u'Rebalance exited with reason {service_rebalance_failed,eventing,\n                              {worker_died,\n                               {\'EXIT\',<0.29602.2>,\n                                {{badmatch,\n                                  {error,\n                                   {unknown_error,\n                                    <<"Some apps are deploying or resuming on some or all Eventing nodes">>}}},\n                                 [{service_rebalancer,rebalance_worker,1,\n                                   [{file,"src/service_rebalancer.erl"},\n                                    {line,170}]},\n                                  {proc_lib,init_p,3,\n                                   [{file,"proc_lib.erl"},{line,232}]}]}}}}.\nRebalance Operation Id = 6868fa7d4f59e481b6b7553fb2b8c450', u'shortText': u'message', u'serverTime': u'2019-12-04T21:39:48.893Z', u'module': u'ns_orchestrator', u'tstamp': 1575524388893, u'type': u'critical'}

             {noformat}
            build: 6.5.0-4917 , last passed on 6.5.0-4908
             * Create 5 node cluster 3 kv , 1 eventing , 1 index and n1ql
             * Deploy SBM handler with timer
             * Start data loading in source bucket
             * {color:#de350b}Add 2 more kv nodes – failed{color}
             * Verify results in source bucket
             * Remove above added nodes from the cluster

            {noformat}./testrunner -i /tmp/testexec.17696.ini -p get-cbcollect-info=True,GROUP=source_bucket_mutation_timers -t eventing.eventing_rebalance.EventingRebalance.test_eventing_rebalance_with_multiple_kv_nodes,doc-per-day=5,dataset=default,nodes_init=5,services_init=kv-kv-kv-eventing-index:n1ql,groups=simple,reset_services=True,handler_code=source_bucket_mutation_with_timers,source_bucket_mutation=True,GROUP=source_bucket_mutation_timers {noformat}
            {noformat}2019-12-04 21:39:56,138] - [rest_client:3331] ERROR - {u'node': u'ns_1@172.23.104.238', u'code': 0, u'text': u'Rebalance exited with reason {service_rebalance_failed,eventing,\n                              {worker_died,\n                               {\'EXIT\',<0.29602.2>,\n                                {{badmatch,\n                                  {error,\n                                   {unknown_error,\n                                    <<"Some apps are deploying or resuming on some or all Eventing nodes">>}}},\n                                 [{service_rebalancer,rebalance_worker,1,\n                                   [{file,"src/service_rebalancer.erl"},\n                                    {line,170}]},\n                                  {proc_lib,init_p,3,\n                                   [{file,"proc_lib.erl"},{line,232}]}]}}}}.\nRebalance Operation Id = 6868fa7d4f59e481b6b7553fb2b8c450', u'shortText': u'message', u'serverTime': u'2019-12-04T21:39:48.893Z', u'module': u'ns_orchestrator', u'tstamp': 1575524388893, u'type': u'critical'}

             {noformat}
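
            The rest_client entry above packs the Erlang exit reason into one escaped string, which is awkward to scan when triaging repeated runs. A minimal sketch for pulling out the failing service, the quoted message, and the rebalance operation id follows. The function name summarize_rebalance_failure and the abbreviated SAMPLE_TEXT are assumptions made for illustration; this is not part of testrunner.

```python
import re

# Abbreviated copy of the 'text' field from the log entry above (stack frames omitted).
SAMPLE_TEXT = (
    "Rebalance exited with reason {service_rebalance_failed,eventing,\n"
    "  {worker_died,\n"
    "   {'EXIT',<0.29602.2>,\n"
    "    {{badmatch,\n"
    "      {error,\n"
    "       {unknown_error,\n"
    '        <<"Some apps are deploying or resuming on some or all Eventing nodes">>}}},\n'
    "     []}}}}.\n"
    "Rebalance Operation Id = 6868fa7d4f59e481b6b7553fb2b8c450"
)


def summarize_rebalance_failure(text):
    """Hypothetical helper: return (service, message, operation_id).

    Any element is None when the corresponding pattern is not found.
    """
    # Service name is the atom following service_rebalance_failed.
    service = re.search(r"\{service_rebalance_failed,\s*(\w+)", text)
    # Erlang binaries are printed as <<"...">>; take the first one.
    message = re.search(r'<<"([^"]*)">>', text)
    # The operation id is a hex string on the trailing line.
    op_id = re.search(r"Rebalance Operation Id = ([0-9a-f]+)", text)
    return tuple(m.group(1) if m else None for m in (service, message, op_id))


if __name__ == "__main__":
    print(summarize_rebalance_failure(SAMPLE_TEXT))
```

            The patterns only cover the shape of this particular failure; other exit reasons may need different expressions.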
            lynn.straus Lynn Straus made changes -
            Link This issue blocks MB-36676 [ MB-36676 ]

            avsej Sergey Avseyev added a comment - Suraj Naik, I've fixed several memory issues in this commit, and these might be related to the issue. Could you re-run the test with the fix applied to libcouchbase? http://review.couchbase.org/c/119163

            ingenthr Matt Ingenthron added a comment - It appears that MB-37167 and MB-37197 may be related. See updates in CCBC-1130, which tracks the changes being made.
            lynn.straus Lynn Straus made changes -
            Due Date 12/Dec/19
            lynn.straus Lynn Straus added a comment - Multiple changes are coming in this area with CCBC-1130, which has an ETA of Dec 12. Setting the ETA for this dependent ticket to Dec 12.

            ingenthr Matt Ingenthron added a comment - See the comments on MB-37197 for a status update. Both issues are currently in the same area and could be the same issue. Continuing to work to isolate. Updating the "due" date accordingly.
            ingenthr Matt Ingenthron made changes -
            Due Date 12/Dec/19 16/Dec/19
            dfinlay Dave Finlay made changes -
            Due Date 16/Dec/19 17/Dec/19

            avsej Sergey Avseyev added a comment - Suraj Naik, could you re-run the test in the subject? Or is it the same test that has been running for MB-37197?

            suraj.naik Suraj Naik (Inactive) added a comment - Sergey Avseyev, it is the same test.
            jeelan.poola Jeelan Poola made changes -
            Resolution Fixed [ 1 ]
            Status Open [ 1 ] Resolved [ 5 ]

            vikas.chaudhary Vikas Chaudhary added a comment - Not seen on 6.5.0-4959
            vikas.chaudhary Vikas Chaudhary made changes -
            Status Resolved [ 5 ] Closed [ 6 ]
            arunkumar Arunkumar Senthilnathan (Inactive) made changes -
            Link This issue relates to SDKQE-1914 [ SDKQE-1914 ]

            People

              satya.nand Satya Nand (Inactive)
              vikas.chaudhary Vikas Chaudhary