Couchbase Server / MB-51940

[N1QL] n1ql query failed due to crashed query service


Details

    • Bug
    • Resolution: Done
    • Major
    • Morpheus
    • 7.0.1
    • query
    • None
    • 7.0.1
    • Untriaged
    • 1
    • Unknown

    Description

      Against our QE Server Manager instance (running Couchbase Server 7.0.1), the test suite dispatcher job ran the query below as part of a job, and the query failed with a client-side timeout.

      Here is the job that was dispatched and failed:

      http://qa.sc.couchbase.com/job/test_suite_dispatcher/69811/console

      Here is the query that was run and the error we see:

      ('the query is', "select * from `QE-Test-Suites` where 'weekly' in partOf and component in ['upgrade'] and subcomponent in ['ce-new-from-60x_1b'];")
      Traceback (most recent call last):
        File "scripts/testDispatcher.py", line 943, in <module>
          main()
        File "scripts/testDispatcher.py", line 375, in main
          for row in results:
        File "/usr/local/lib/python3.7/site-packages/couchbase/n1ql.py", line 542, in __iter__
          raw_rows = self.raw.fetch(self._mres)
      couchbase.exceptions._TimeoutError_0x17 (generated, catch TimeoutError): <RC=0x17[Client-Side timeout exceeded for operation. Inspect network conditions or increase the timeout], HTTP Request failed. Examine 'objextra' for full result, Results=1, C Source=(src/http.c,144), OBJ=ViewResult<rc=0x17[Client-Side timeout exceeded for operation. Inspect network conditions or increase the timeout], value=None, http_status=0, tracing_context=0, tracing_output=None>, Tracing Output={":nokey:0": null}>
      

      On the node that was queried (which is running 7.0.1), we see that the query service crashed. It is unclear whether the query caused the crash or whether the query failed because the query service had crashed. The panic is raised in the Order/OrderLimit execution operators, but the query above does not contain an ORDER BY or LIMIT clause. Logs are attached below.

      2022-04-21T15:42:52.709-07:00 [Info] GSIC[default/QE-mobile-pool-_default-_default-1649242372127826429] logstats "QE-mobile-pool" {"gsi_scan_count":178375,"gsi_scan_duration":17627670806628,"gsi_throttle_duration":0,"gsi_prime_duration":15599666653855,"gsi_blocked_duration":0,"gsi_total_temp_files":0}
      panic: runtime error: invalid memory address or nil pointer dereference
              panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x228 pc=0x1cb775d]
       
      goroutine 87197767 [running]:
      github.com/couchbase/query/execution.(*Order).releaseValues(0x0)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/execution/order.go:111 +0x2d
      panic(0x22007a0, 0x38b77a0)
              /home/couchbase/.cbdepscache/exploded/x86_64/go-1.13.7/go/src/runtime/panic.go:679 +0x1b2
      github.com/couchbase/query/execution.(*OrderLimit).RunOnce(0xc00a4c0680, 0xc002c28c80, 0x0, 0x0)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/execution/order_limit.go:93 +0x5c
      github.com/couchbase/query/execution.execOp(0x2804de0, 0xc00a4c0680, 0xc002c28c80, 0x0, 0x0)
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/execution/base.go:505 +0x54
      created by github.com/couchbase/query/execution.(*base).fork
              /home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/query/execution/base.go:516 +0x103
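
The top frame of the trace, (*Order).releaseValues(0x0), shows the method being invoked on a nil receiver. The following is a minimal Go sketch of that failure mode, not the actual query service code: the type `order`, its `values` field, and the helper `tryRelease` are hypothetical names chosen for illustration. It shows that Go allows a method call on a nil pointer receiver, that the panic only occurs once the method touches a field, and that a nil guard in the cleanup path avoids it. Whether the real fix takes this form is an assumption.

```go
package main

import "fmt"

// order is a hypothetical stand-in for an operator that buffers values.
type order struct {
	values []string
}

// releaseValues touches a field without checking the receiver, so calling it
// on a nil *order panics with a nil pointer dereference.
func (o *order) releaseValues() {
	o.values = nil
}

// releaseValuesGuarded performs the same cleanup but returns early when the
// receiver is nil.
func (o *order) releaseValuesGuarded() {
	if o == nil {
		return
	}
	o.values = nil
}

// tryRelease runs one of the two cleanup variants and reports whether it
// panicked. recover is used here only so the demo can continue running.
func tryRelease(o *order, guarded bool) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	if guarded {
		o.releaseValuesGuarded()
	} else {
		o.releaseValues()
	}
	return false
}

func main() {
	var missing *order // nil, like the 0x0 receiver in the trace

	fmt.Println("unguarded cleanup on nil receiver panicked:", tryRelease(missing, false))
	fmt.Println("guarded cleanup on nil receiver panicked:", tryRelease(missing, true))
	fmt.Println("unguarded cleanup on valid receiver panicked:", tryRelease(&order{values: []string{"row"}}, false))
}
```

In the trace, releaseValues appears directly above a runtime panic frame that was entered from (*OrderLimit).RunOnce, which is consistent with a deferred cleanup running while an earlier panic was already unwinding. That reading would also explain why the panic message is printed twice.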
      

      The node that had the crash was 172.23.120.140.

      https://cb-jira.s3.us-east-2.amazonaws.com/logs/server_mgr_query_crash_042522/collectinfo-2022-04-25T185603-ns_1%40172.23.104.162.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/server_mgr_query_crash_042522/collectinfo-2022-04-25T185603-ns_1%40172.23.120.139.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/server_mgr_query_crash_042522/collectinfo-2022-04-25T185603-ns_1%40172.23.120.140.zip
      https://cb-jira.s3.us-east-2.amazonaws.com/logs/server_mgr_query_crash_042522/collectinfo-2022-04-25T185603-ns_1%40172.23.124.12.zip

      It is unclear whether this affects other versions, as we do not know what specifically caused this panic/crash.


          People

            ajay.bhullar Ajay Bhullar
            ajay.bhullar Ajay Bhullar