Couchbase Server: MB-37633

child process exited with status 137 with n1ql handler

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 6.0.4
    • Fix Version/s: 6.0.4
    • Component/s: eventing
    • Labels:

      Description

      Build: 6.0.4-3070 (the test passed on 6.0.2)

      While running deploy and undeploy in a loop, eventing exited with status 137.

      Test

      ./testrunner -i vikas-nodes.ini -t eventing.eventing_lifecycle.EventingLifeCycle.test_function_deploy_undeploy_in_a_loop_for_n1ql_operations,nodes_init=4,services_init=kv-eventing-index-n1ql,dataset=default,groups=simple,reset_services=True,skip_cleanup=True 

      Service 'eventing' exited with status 137. Restarting. Messages:
      2020-01-22T21:00:56.457-08:00 [Info] Consumer::startDcp [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:43419:5128] vb: 840 checkpoint blob prexisted, UUID: assigned worker:
      2020-01-22T21:00:56.460-08:00 [Info] Consumer::checkIfAlreadyEnqueued [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:43419:5128] vb: 840 not enqueued
      2020-01-22T21:00:56.460-08:00 [Info] Consumer::addToEnqueueMap [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:43419:5128] vb: 840 enqueuing
      2020-01-22T21:00:56.460-08:00 [Info] Consumer::startDcp [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:43419:5128] vb: 840 Sending streamRequestInfo size: 154
      2020-01-22T21:00:57.968-08:00 [Info] Consumer::startDcp [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:43419:5128] vb: 841 vbuuid: 54716609244456 flog: [[54716609244456 0]] going to start dcp stream
      [goport(/opt/couchbase/bin/eventing-producer)] 2020/01/22 21:00:58 child process exited with status 137
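
      For readers triaging the status code above: on POSIX systems an exit status greater than 128 conventionally means the process was terminated by signal (status - 128), so 137 would correspond to signal 9 (SIGKILL). The sketch below decodes a status under that assumption; whether goport reports statuses using this convention is not confirmed in this ticket, and the helper name is hypothetical.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status.

    Assumes the common POSIX convention that a status above 128
    means "terminated by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited normally with code {status}"


if __name__ == "__main__":
    print(describe_exit_status(137))
```

      A SIGKILL can come from an explicit kill or from the kernel's out-of-memory handling, so the status alone does not identify the cause.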
      
      

      Eventing logs

      2020-01-22T21:02:43.997-08:00 [Warn] client::Serve [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860] Failed to read from stdout pipe, err: EOF
      2020-01-22T21:02:43.998-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860] Received signal 11 <unknown> 000000000000
      2020-01-22T21:02:43.998-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]
      2020-01-22T21:02:43.998-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860] ==== C stack trace ===============================
      2020-01-22T21:02:43.998-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]
      2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f1b1f6734]
      2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f156863d0]
      2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f15eceffb]
      2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f172bd8c8]
      2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f172bd92e]
      2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x00000045e2c8]
      2020-01-22T21:02:44.000-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x00000045f085]
      2020-01-22T21:02:44.000-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x00000045ad15]
      2020-01-22T21:02:44.000-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f1a366bf0]
      2020-01-22T21:02:44.000-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f1a41352c]
      2020-01-22T21:02:44.000-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x7f2f1a412a6f]
      2020-01-22T21:02:44.000-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860]  [0x3e3aa3f843fd]
      2020-01-22T21:02:44.004-08:00 [Info] eventing-consumer [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860] [end of stack trace]
      2020-01-22T21:02:44.004-08:00 [Warn] client::Serve [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860] Failed to read from stderr pipe, err: read |0: file already closed
      2020-01-22T21:02:44.005-08:00 [Warn] client::Serve [worker_Function_958132025_test_function_deploy_undeploy_in_a_loop_for_n1ql_operations_2:45169:5860] Exiting c++ worker with error: signal: segmentation fault 
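
      The C stack trace above contains only raw frame addresses. As a triage aid, the following minimal sketch pulls those addresses out of eventing log lines so they can be passed to a symbolizer. The function name and regular expression are illustrative assumptions based only on the line format shown in this ticket, not part of any Couchbase tooling.

```python
import re

# A frame line ends with a bracketed hex address, e.g. "...:5860]  [0x7f2f1b1f6734]".
# The worker tag earlier on the line is also bracketed but does not start with 0x.
FRAME_RE = re.compile(r"\[(0x[0-9a-fA-F]+)\]\s*$")


def extract_frames(log_lines):
    """Return the stack frame addresses (as ints) found in the given log lines."""
    frames = []
    for line in log_lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append(int(match.group(1), 16))
    return frames


# Three lines abbreviated from the trace in this ticket
sample = [
    "2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_x:45169:5860]  [0x7f2f1b1f6734]",
    "2020-01-22T21:02:43.999-08:00 [Info] eventing-consumer [worker_x:45169:5860]  [0x00000045e2c8]",
    "2020-01-22T21:02:44.004-08:00 [Info] eventing-consumer [worker_x:45169:5860] [end of stack trace]",
]

if __name__ == "__main__":
    for address in extract_frames(sample):
        print(hex(address))
```

      The extracted addresses could then be fed to a tool such as addr2line together with the matching binary and debug symbols. Addresses that fall inside shared libraries would first need the library load offset subtracted.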

            Activity

            vikas.chaudhary Vikas Chaudhary added a comment -

            Many other rebalance test cases are failing with the same error:

            ./testrunner -i /tmp/testexec.15188.ini -p get-cbcollect-info=False,GROUP=n1ql_op_without_timers,get-cbcollect-info=True -t eventing.eventing_rebalance.EventingRebalance.test_stop_start_eventing_rebalance,nodes_init=6,services_init=kv-kv-eventing-eventing-eventing-index:n1ql,dataset=default,groups=simple,reset_services=True,doc-per-day=20,enable_failover=True,handler_code=n1ql_op_without_timers,replicas=1,GROUP=n1ql_op_without_timers
             

            ./testrunner -i /tmp/testexec.15188.ini -p get-cbcollect-info=False,GROUP=n1ql_op_without_timers,get-cbcollect-info=True -t eventing.eventing_rebalance.EventingRebalance.test_killing_eventing_processes_during_eventing_rebalance,doc-per-day=20,dataset=default,nodes_init=5,services_init=kv-kv-eventing-eventing-index:n1ql,groups=simple,reset_services=True,handler_code=n1ql_op_without_timers,replicas=1,GROUP=n1ql_op_without_timers 
            
            

            Logs attached.

            Gautham.Banasandra Gautham Banasandra (Inactive) added a comment -

            Vikas Chaudhary, in the description you mentioned that this test passed on 6.0.2. Do you happen to know the build in 6.0.3 where this test passed?

            vikas.chaudhary Vikas Chaudhary added a comment -

            Gautham Banasandra, because N1QL was not GA, we were not running all the tests during 6.0.3.

            jeelan.poola Jeelan Poola added a comment -

            Removing the "Regression Yes" label. QE confirmed the test was not run against 6.0.3, so we are checking whether the issue existed in 6.0.3 as well.

            Gautham.Banasandra Gautham Banasandra (Inactive) added a comment -

            I just ran this test on the 6.0.3 release build. It fails - http://qa.sc.couchbase.com/job/dev_testbed_blr3/263/

            Gautham.Banasandra Gautham Banasandra (Inactive) added a comment -

            In 6.0.3 we changed eventing to run all N1QL queries as prepared queries by setting the flag LCB_CMDN1QL_F_PREPCACHE: https://github.com/couchbase/eventing/commit/1ba3c971e0991b7e6233a8bb6a23fcdbce9e3197
            This causes a SEGFAULT in eventing-consumer for a test that deploys and undeploys, in a loop, a Function involving a N1QL query.
            As Vikas Chaudhary mentioned above, this test wasn't run in 6.0.3, so the problem went undetected there and is present in 6.0.4 as well. Since we haven't yet root-caused why the SEGFAULT happens when LCB_CMDN1QL_F_PREPCACHE is set, we are disabling this flag by default: https://github.com/couchbase/eventing/commit/6daf6338e56953bac41f3e5e5f3468ec3f31bc35

            vikas.chaudhary Vikas Chaudhary added a comment -

            Not seeing this on 6.0.4-3082.


              People

              Assignee:
              Gautham.Banasandra Gautham Banasandra (Inactive)
              Reporter:
              vikas.chaudhary Vikas Chaudhary
              Votes: 0
              Watchers: 5
