  1. Couchbase Server
  2. MB-51289

[System Test] Queries stuck in timeout/submitted stage since 9+ hrs blocking rebalance

Details

    Description

      Build : 7.1.0-2416
      Test : -test tests/integration/neo/test_neo_couchstore_milestone4.yml -scope tests/integration/neo/scope_couchstore.yml
      Scale : 3
      Iteration : 1st

      144 queries are currently stuck in the "timeout" or "submitted" state and are blocking a rebalance operation that adds a new query node (172.23.104.137) to the cluster. The rebalance has been in progress for 5.5+ hrs, more than 5 hrs of which have been spent in the query service rebalance phase because of this issue.

      select state,count(*) from system:active_requests where state!="running" group by state
      [
        {
          "$1": 128,
          "state": "timeout"
        },
        {
          "$1": 16,
          "state": "submitted"
        }
      ]
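
      To see where the stuck requests sit, the attached active_requests dumps could be tallied per node and state. The following is only a minimal sketch: it assumes each dump has been loaded as a list of JSON rows carrying `state` and `node` fields (optionally wrapped under an `active_requests` key, as `SELECT *` returns them), and `count_stuck` is a hypothetical helper name, not part of any Couchbase tool.

```python
from collections import Counter


def count_stuck(rows):
    """Count non-running requests per (node, state).

    Assumption: `rows` is a list of dicts from a system:active_requests
    dump; rows may be wrapped under an "active_requests" key.
    """
    counts = Counter()
    for row in rows:
        # Unwrap the keyspace name if the dump came from SELECT *.
        req = row.get("active_requests", row)
        state = req.get("state")
        # Mirror the filter state != "running"; skip rows with no state.
        if state is not None and state != "running":
            counts[(req.get("node"), state)] += 1
    return counts
```

      Feeding it the rows from all three dumps would show whether the "timeout" and "submitted" requests are concentrated on one query node or spread across them.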
      

      1. Do we really need graceful shutdown when adding a new query node to the cluster?
      2. What is causing these queries to time out or stay in the submitted state?

      Query nodes : 172.23.104.137, 172.23.104.155, 172.23.104.157

      Attached :
      1. cbcollect
      2. active_requests dumps from all 3 nodes

      Not sure whether this is a regression or is related to a recent change in the longevity test to run N1QL statements in JS UDFs. The previous run of the same test with 7.1.0-2400 did not show this issue.

      UPDATE: rebalance completed successfully after I manually cancelled all the above queries.
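
      The report does not say how the queries were cancelled. One possible way, sketched here under assumptions, is to delete the matching entries from system:active_requests through the query service REST endpoint. The helper names, the default port 8093, and the plain-HTTP Basic-auth setup are assumptions for illustration, not details taken from this test environment.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_cancel_statement(states):
    """Build a statement that cancels requests in the given states."""
    # json.dumps yields a double-quoted array literal, e.g. ["timeout", "submitted"].
    return "DELETE FROM system:active_requests WHERE state IN " + json.dumps(list(states))


def run_statement(host, user, password, statement, port=8093):
    """POST a statement to the query service (assumed at /query/service)."""
    data = urllib.parse.urlencode({"statement": statement}).encode()
    req = urllib.request.Request(f"http://{host}:{port}/query/service", data=data)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

      For example, `run_statement("172.23.104.155", user, password, build_cancel_statement(["timeout", "submitted"]))` would target the two states listed above; the credentials are placeholders.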

            People

              mihir.kamdar Mihir Kamdar (Inactive)