Couchbase Server / MB-50455

[System Test] Query service not responding on 2 query nodes

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Affects Version / Fix Version: 7.1.0
    • Component: fts

    Description

      Build : 7.1.0-2081
      Test : -test tests/integration/neo/test_neo_couchstore_milestone4.yml -scope tests/integration/neo/scope_couchstore.yml
      Scale : 3
      Iteration : 1st

      The longevity cluster currently has 3 query nodes: 172.23.104.155, 172.23.104.157 and 172.23.104.137. Of these, 172.23.104.155 and 172.23.104.157 are unresponsive, and cbq cannot connect to either of them.

      [root@localhost bin]# ./cbq -e 172.23.104.155:8093 -u Administrator -p password
      ^C --> I aborted after waiting for a few minutes.
      [root@localhost bin]# ./cbq -e 172.23.104.157:8093 -u Administrator -p password
       ERROR 100 : N1QL: Unable to connect to N1QL endpoint: http://Administrator:password@172.23.104.157:8093/query/service
      HTTP ERR: 503 Service Unavailable
       
       
       Path to history file for the shell : /root/.cbq_history
       
      cbq> [root@localhost bin]# ./cbq -e 172.23.104.137:8093 -u Administrator -p password
       Connected to : http://172.23.104.137:8093/. Type Ctrl-D or \QUIT to exit.
       
       Path to history file for the shell : /root/.cbq_history
      cbq> select 1;
      {
          "requestID": "b2045798-6856-4451-941b-43bf004a664b",
          "signature": {
              "$1": "number"
          },
          "results": [
          {
              "$1": 1
          }
          ],
          "status": "success",
          "metrics": {
              "elapsedTime": "547.462µs",
              "executionTime": "425.651µs",
              "resultCount": 1,
              "resultSize": 23,
              "serviceLoad": 6
          }
      }
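
The manual cbq checks above could be repeated with a bounded timeout instead of waiting and aborting by hand. The following is a minimal sketch, not part of the original report: it assumes the query service's `/admin/ping` endpoint is reachable on port 8093, and the `probe` helper and its `fetch` parameter are hypothetical names introduced for illustration.

```python
import socket
import urllib.error
import urllib.request


def probe(host, timeout=10.0, fetch=urllib.request.urlopen):
    """Return a short status string for one query node.

    Assumption: the node exposes GET /admin/ping on port 8093.
    `fetch` is injectable so the function can be exercised without a cluster.
    """
    url = f"http://{host}:8093/admin/ping"
    try:
        with fetch(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        return f"HTTP {err.code}"
    except urllib.error.URLError as err:
        # Connection-level failure; a connect timeout is wrapped here.
        if isinstance(err.reason, socket.timeout):
            return "timed out"
        return f"unreachable: {err.reason}"
    except socket.timeout:
        # The connection opened but no response arrived in time.
        return "timed out"
```

Calling `probe("172.23.104.155")` for each of the three nodes would then distinguish a hang (reported as a timeout) from an explicit 503 without any manual abort.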

      Attached are the heap dumps, goroutine dumps, CPU and memory profiles, and active/completed request dumps from all 3 nodes.

      The active requests dump for 172.23.104.155 shows 1230 requests, mostly in the submitted state. The dump for 172.23.104.157 has 200+ such requests.
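
A per-state tally like the one quoted above can be reproduced from an active requests dump. This is a rough sketch under stated assumptions: the dump is taken to be a JSON array of objects that each carry a `state` field, and both the `count_by_state` helper and the sample entries are hypothetical, not data from the attached dumps.

```python
import json
from collections import Counter


def count_by_state(dump_text):
    """Tally the entries of an active requests dump by their 'state' field.

    Assumption: the dump is a JSON array of request objects.
    Entries without a 'state' field are counted as 'unknown'.
    """
    entries = json.loads(dump_text)
    return Counter(entry.get("state", "unknown") for entry in entries)


# Hypothetical sample, shaped like the assumed dump format.
sample = json.dumps([
    {"requestId": "r1", "state": "submitted"},
    {"requestId": "r2", "state": "submitted"},
    {"requestId": "r3", "state": "running"},
])

for state, count in count_by_state(sample).most_common():
    print(state, count)
```

Running the same function over the dump from each node gives a quick way to compare how many requests are stuck in the submitted state on each one.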

      The attached cbcollectinfo is from the time when this issue was first observed in the eagle-eye tool. We also have logs from before and after this time, so please let us know if you want logs from a specific time.

          People

            mihir.kamdar Mihir Kamdar (Inactive)