Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-36356

[System Test] [Longevity] pausing of eventing handler is stuck since 15 hrs

    XMLWordPrintable

Details

    • Bug
    • Status: Closed
    • Test Blocker
    • Resolution: Duplicate
    • 6.5.0
    • 6.5.0
    • eventing
    • component cluster
    • Untriaged
    • Unknown

    Description

      Build : 6.5.0-4471
      Test : -test tests/integration/test_allFeatures_madhatter_durability.yml -scope tests/integration/scope_Xattrs_Madhatter.yml
      Scale : 3
      Iteration : 1st

      Step:

      [2019-10-06T23:26:24-07:00, sequoiatools/eventing:6.5:cf2e4a] /eventing.py 172.23.98.135 8096 mad-hatter/bucket_op_n1ql.json Administrator password pause true
      warning using 'json' filter:  unexpected end of JSON input []
      warning using 'json' filter:  unexpected end of JSON input []
      

      Pausing eventing handler started at 2019-10-06T23:26:24-07:00 and below logs taken at 2019-10-07 01:11:08,585.

      logs:

      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.156.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.157.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.164.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.5.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.61.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.67.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.69.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.70.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.87.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.104.88.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.106.100.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.106.188.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.108.103.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.96.148.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.96.251.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.96.252.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.96.253.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.96.95.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.97.119.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.97.121.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.97.122.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.97.239.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.97.242.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.98.135.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.99.11.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.99.21.zip
      url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1570434126/collectinfo-2019-10-07T074208-ns_1%40172.23.99.25.zip
      2019-10-07 01:11:08,585 - systestmon - INFO - ====== Log scan iteration number 39 complete. Sleeping for 3600 seconds ======

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          Bulk closing all invalid, duplicate and won't fix bugs. Please feel free to reopen them

          raju Raju Suravarjjala added a comment - Bulk closing all invalid, duplicate and won't fix bugs. Please feel free to reopen them
          jeelan.poola Jeelan Poola added a comment -

          The symptoms and root cause here seem very similar to MB-36095 which is now fixed. The Toy with fix was tested against perf tests which reported the same issue in runs listed below successfully. Hence resolving it as duplicate. Please reopen if the issue is seen again.
          http://perf.jenkins.couchbase.com/job/themis-dev/898/console
          http://perf.jenkins.couchbase.com/job/themis-dev/899/console

          jeelan.poola Jeelan Poola added a comment - The symptoms and root cause here seem very similar to MB-36095 which is now fixed. The Toy with fix was tested against perf tests which reported the same issue in runs listed below successfully. Hence resolving it as duplicate. Please reopen if the issue is seen again. http://perf.jenkins.couchbase.com/job/themis-dev/898/console http://perf.jenkins.couchbase.com/job/themis-dev/899/console
          jeelan.poola Jeelan Poola added a comment - - edited

          Log file from node 172.23.98.135 is full of below logs

          2019-10-06T22:50:41.013-07:00 [Info] eventing-consumer [worker_bucket_op_n1ql_2:/tmp/127.0.0.1:8091_2_156534221.sock:7804] file : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/query/src/helper.cc line : 202 caller : GetErrorCodes : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 216 caller : CustomErrorCtor : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 182 caller : NewCustomError : Returning empty valueUnable to construct N1QLError : Unable to instantiate N1QLError
          2019-10-06T22:50:41.015-07:00 [Info] eventing-consumer [worker_bucket_op_n1ql_0:/tmp/127.0.0.1:8091_0_156534221.sock:7805] file : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/query/src/helper.cc line : 202 caller : GetErrorCodes : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 216 caller : CustomErrorCtor : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 182 caller : NewCustomError : Returning empty valueUnable to construct N1QLError : Unable to instantiate N1QLError
          2019-10-06T22:50:41.015-07:00 [Info] eventing-consumer [worker_bucket_op_n1ql_0:/tmp/127.0.0.1:8091_0_156534221.sock:7805] file : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/query/src/helper.cc line : 202 caller : GetErrorCodes : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 216 caller : CustomErrorCtor : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 182 caller : NewCustomError : Returning empty valueUnable to construct N1QLError : Unable to instantiate N1QLError
          

          Gautham Banasandra We need to fix above flood of logs. Else it will quickly overflow log file and we will lose ability to troubleshoot when this happens.

          The only instance of pause in log file began at

          2019-10-06T23:26:31.483-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql current state: 2 requested status for deployment: true processing: false
          2019-10-06T23:26:31.483-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql begin pausing process
          

          And completed at

          2019-10-06T23:29:32.551-07:00 [Info] Producer::Serve [bucket_op_n1ql:0] Closed stop chan and app log writer handle
          2019-10-06T23:29:32.551-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql Cleaned up running Eventing.Producer instance
          2019-10-06T23:29:32.551-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql pausing done
          

          Likely the logs have rolled over.

          jeelan.poola Jeelan Poola added a comment - - edited Log file from node 172.23.98.135 is full of below logs 2019-10-06T22:50:41.013-07:00 [Info] eventing-consumer [worker_bucket_op_n1ql_2:/tmp/127.0.0.1:8091_2_156534221.sock:7804] file : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/query/src/helper.cc line : 202 caller : GetErrorCodes : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 216 caller : CustomErrorCtor : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 182 caller : NewCustomError : Returning empty valueUnable to construct N1QLError : Unable to instantiate N1QLError 2019-10-06T22:50:41.015-07:00 [Info] eventing-consumer [worker_bucket_op_n1ql_0:/tmp/127.0.0.1:8091_0_156534221.sock:7805] file : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/query/src/helper.cc line : 202 caller : GetErrorCodes : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 216 caller : CustomErrorCtor : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 182 caller : NewCustomError : Returning empty valueUnable to construct N1QLError : Unable to instantiate N1QLError 2019-10-06T22:50:41.015-07:00 [Info] eventing-consumer [worker_bucket_op_n1ql_0:/tmp/127.0.0.1:8091_0_156534221.sock:7805] file : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/query/src/helper.cc line : 202 caller : GetErrorCodes : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 216 caller : CustomErrorCtor : Returning empty valuefile : /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/eventing/libs/src/js_exception.cc line : 182 caller : NewCustomError : Returning empty valueUnable to construct N1QLError : Unable to instantiate N1QLError Gautham Banasandra We need to fix above flood of logs. Else it will quickly overflow log file and we will lose ability to troubleshoot when this happens. The only instance of pause in log file began at 2019-10-06T23:26:31.483-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql current state: 2 requested status for deployment: true processing: false 2019-10-06T23:26:31.483-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql begin pausing process And completed at 2019-10-06T23:29:32.551-07:00 [Info] Producer::Serve [bucket_op_n1ql:0] Closed stop chan and app log writer handle 2019-10-06T23:29:32.551-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql Cleaned up running Eventing.Producer instance 2019-10-06T23:29:32.551-07:00 [Info] SuperSupervisor::SettingsChangeCallback [2] Function: bucket_op_n1ql pausing done Likely the logs have rolled over.

          Girish Benakappa Which are the eventing nodes in the above list? Also, it would be great to create a supportal snapshot.

          jeelan.poola Jeelan Poola added a comment - Girish Benakappa Which are the eventing nodes in the above list? Also, it would be great to create a supportal snapshot.

          People

            Gautham.Banasandra Gautham Banasandra (Inactive)
            girish.benakappa Girish Benakappa
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty