Couchbase Documentation: DOC-8481

FTS - Add documentation for runtime query supervising capabilities

    Description

      Search service users may want to monitor the queries that are currently active on an FTS node. This helps them investigate ongoing slow queries and supports other debugging tasks.

      FTS has a new REST endpoint for serving this capability.

       

      Users can use the `/api/query` endpoint to get details about all the active queries on any FTS node in a cluster.

       

      Example:

      curl -XGET -H "Content-Type: application/json" -u<UserName:Password> http://localhost:9200/api/query

       

      This endpoint takes an optional `longerThan` argument, which restricts the results to queries that have been running longer than the given duration.

       

      curl -XGET -H "Content-Type: application/json" -uAdministrator:asdasd 'http://localhost:9200/api/query?longerThan=10s'

       

      Users can use the `/api/query/index/{indexName}` endpoint to get details about all the active queries against a given FTS index in the system. The `longerThan` filter parameter applies here as well.

       

      curl -XGET -H "Content-Type: application/json" -uAdministrator:asdasd 'http://localhost:9200/api/query/index/<indexName>?longerThan=1ms'
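
The two listing endpoints and the optional `longerThan` filter described above can be composed programmatically. The following is a minimal Python sketch of a URL builder, not part of FTS itself: the function name `active_queries_url` and the index name `my-index` are hypothetical, and the base URL is taken from the curl examples above.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def active_queries_url(base_url: str,
                       index_name: Optional[str] = None,
                       longer_than: Optional[str] = None) -> str:
    """Build the URL for listing active queries (hypothetical helper)."""
    path = "/api/query"
    if index_name is not None:
        # Restrict the listing to one index; escape the name for use in a path.
        path += "/index/" + quote(index_name, safe="")
    url = base_url.rstrip("/") + path
    if longer_than is not None:
        # Optional duration filter, for example "10s" or "1ms".
        url += "?" + urlencode({"longerThan": longer_than})
    return url


print(active_queries_url("http://localhost:9200"))
print(active_queries_url("http://localhost:9200", longer_than="10s"))
print(active_queries_url("http://localhost:9200", "my-index", "1ms"))
```

The helper only builds the URL; authentication and the actual HTTP call are left to whichever client is in use.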

       

        Activity

          Sreekanth Sivasankaran added a comment - hey Thejas Orkombu, let's use this ticket to track any pending query supervisor fixes for 7.1 documentation.
          thejas.orkombu Thejas Orkombu added a comment - - edited

          Monitor Runtime Queries
          FTS provides REST endpoints to supervise runtime queries. A request can be sent to any Search node and returns cluster-wide information about the active queries in the system. To do this, the node performs a scatter-gather operation to collect the queries running on the different Search nodes. Users can also filter the results by duration and index name.
           
          Users can also use the query monitor feature in the UI to supervise the running queries for an index across the cluster and to abort any long-running queries. To get to the query monitor:

          1. In the left pane, click Search.
          2. Click the index of choice from the list of indexes.
          3. Click the Details tab in the expanded row of the index.
          4. On the top bar, click Query Monitor to open the query monitor page.

          <query monitor preview image>
          Any active queries in the system are listed on this page.

          For example:
          <image the query monitor>

          Above the list of active queries, a user can restrict the displayed queries to those that have been running longer than the value specified in the Longer than field. For example, this is the list of queries running longer than 10s:
          <query monitor with longer than filter image>

          A user can also freeze the monitor in its current state by clicking the Pause button. While paused, the monitor does not fetch the latest state of the system or update the page. This is useful when a user wants to take a thorough look at the list of queries and the query-specific information.
          Because long-running queries are undesirable, especially when many of them run for a long time and consume resources, a user can also abort a long-running query. This is only possible while the monitor state is Active, because in the frozen state the displayed query could be stale and already cancelled.
          <abort message image>

          API Query Index

          /api/query/index/{indexName}

          Users can use the /api/query/index/{indexName} endpoint to get the details of all the active queries for a given FTS index in the system. This endpoint also accepts the longerThan argument, which restricts the results to queries that have been running longer than the given duration.

          The longerThan duration string is a signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "20s", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
          For example:

          curl -XGET -H "Content-Type: application/json" -u <username>:<password> \
          'http://localhost:8094/api/query/index/DemoIndex1?longerThan=1ms'

          Output:

          {
            "status": "ok",
            "stats": {
              "total": 3,
              "successful": 3
            },
            "totalActiveQueryCount": 4,
            "filteredActiveQueries": {
              "indexName": "DemoIndex1",
              "longerThan": "1ms",
              "queryCount": 2,
              "queryMap": {
                "b91d75480470f979f65f04e8f20a1f7b-16": {
                  "QueryContext": {
                    "query": {
                      "query": "good restraunts in france"
                    },
                    "size": 10,
                    "from": 0,
                    "timeout": 120000,
                    "index": "DemoIndex1"
                  },
                  "executionTime": "1.059754811s"
                },
                "f76b2d51397feee28c1e757ed426ef93-2": {
                  "QueryContext": {
                    "query": {
                      "query": "mexican food in england"
                    },
                    "size": 10,
                    "from": 0,
                    "timeout": 120000,
                    "index": "DemoIndex1"
                  },
                  "executionTime": "1.058247896s"
                }
              }
            }
          }

          The "stats" field carries the scatter-gather information: "total" is the number of FTS nodes that participated in the scatter-gather, "successful" is the number of nodes that returned successfully, and "failed" is the number of nodes that returned an error.
          If one or more nodes fail, the endpoint can still return partial results, even when only a single node returned successfully. The errors that caused the failures are captured in the "errors" field, with the UUID of each failed node as the key and the error as the value.
          The "totalActiveQueryCount" is the number of active queries in the system, regardless of any index name or "longerThan" filter in the request. The "filteredActiveQueries" field, in contrast, holds the query map entries that remain after the filters are applied, that is, the queries against the requested index that have been running longer than the "longerThan" value. The "queryCount" is the number of queries that meet the filtering conditions in the request.
          Note the keys of the query map entries. They have the form "hashValue-queryID", where hashValue is the UUID of the coordinator node for that particular "QueryContext" (i.e. the actual query) and queryID is the query's local ID on the coordinator node.
          "executionTime" is the duration for which the query has been running.
           
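
The response fields described above can be unpacked on the client side. The following is a minimal Python sketch, assuming the response has already been decoded from JSON into a dict shaped like the sample output for DemoIndex1. The helper names `split_query_key` and `summarize` are hypothetical, and the sample dict is abbreviated from that output.

```python
from typing import Dict, List, Tuple


def split_query_key(key: str) -> Tuple[str, str]:
    """Split a queryMap key of the form 'uuid-queryID' at its last hyphen."""
    uuid, _, query_id = key.rpartition("-")
    return uuid, query_id


def summarize(response: Dict) -> Dict:
    """Collect per-query details and flag a partial scatter-gather result."""
    stats = response.get("stats", {})
    query_map = response.get("filteredActiveQueries", {}).get("queryMap", {})
    queries: List[Dict] = []
    for key, entry in query_map.items():
        uuid, query_id = split_query_key(key)
        queries.append({
            "uuid": uuid,            # coordinator node of the query
            "query_id": query_id,    # local ID on that coordinator
            "index": entry["QueryContext"]["index"],
            "execution_time": entry["executionTime"],
        })
    return {
        # Fewer successful nodes than participating nodes means partial results.
        "partial": stats.get("successful", 0) < stats.get("total", 0),
        "queries": queries,
    }


# Abbreviated from the sample output above.
sample = {
    "status": "ok",
    "stats": {"total": 3, "successful": 3},
    "totalActiveQueryCount": 4,
    "filteredActiveQueries": {
        "indexName": "DemoIndex1",
        "longerThan": "1ms",
        "queryCount": 2,
        "queryMap": {
            "b91d75480470f979f65f04e8f20a1f7b-16": {
                "QueryContext": {"index": "DemoIndex1"},
                "executionTime": "1.059754811s",
            },
            "f76b2d51397feee28c1e757ed426ef93-2": {
                "QueryContext": {"index": "DemoIndex1"},
                "executionTime": "1.058247896s",
            },
        },
    },
}

summary = summarize(sample)
print("partial results:", summary["partial"])
for q in summary["queries"]:
    print(q["uuid"], q["query_id"], q["index"], q["execution_time"])
```

Splitting at the last hyphen keeps the sketch independent of the UUID's own format.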
          API query

          /api/query

          Users can use the /api/query endpoint to get the details of all the active queries on any FTS node in a cluster.

          For example:

          curl -XGET -H "Content-Type: application/json" -u <username>:<password> \
          http://localhost:8094/api/query 

          Output

          {
            "status": "ok",
            "stats": {
              "total": 3,
              "successful": 3
            },
            "totalActiveQueryCount": 4,
            "filteredActiveQueries": {
              "queryCount": 4,
              "queryMap": {
                "b91d75480470f979f65f04e8f20a1f7b-17": {
                  "QueryContext": {
                    "query": {
                      "query": "good restraunts in france"
                    },
                    "size": 10,
                    "from": 0,
                    "timeout": 120000,
                    "index": "DemoIndex1"
                  },
                  "executionTime": "2.144802122s"
                },
                "b91d75480470f979f65f04e8f20a1f7b-18": {
                  "QueryContext": {
                    "query": {
                      "query": "decent hotel with a pool in italy"
                    },
                    "size": 10,
                    "from": 0,
                    "timeout": 120000,
                    "index": "DemoIndex2"
                  },
                  "executionTime": "2.144712787s"
                },
                "b91d75480470f979f65f04e8f20a1f7b-19": {
                  "QueryContext": {
                    "query": {
                      "query": "germany"
                    },
                    "size": 10,
                    "from": 0,
                    "timeout": 120000,
                    "index": "DemoIndex2"
                  },
                  "executionTime": "2.143957727s"
                },
                "f76b2d51397feee28c1e757ed426ef93-3": {
                  "QueryContext": {
                    "query": {
                      "query": "mexican food in england"
                    },
                    "size": 10,
                    "from": 0,
                    "timeout": 120000,
                    "index": "DemoIndex1"
                  },
                  "executionTime": "2.14286421s"
                }
              }
            }
          } 
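
Because the /api/query output spans every index, it can be useful to see which indexes the active queries are hitting. The following is a minimal Python sketch that tallies queryMap entries per index. The function name `count_by_index` is hypothetical, and the sample dict is abbreviated from the output above.

```python
from collections import Counter
from typing import Dict


def count_by_index(response: Dict) -> Counter:
    """Count active queries per index from a decoded /api/query response."""
    query_map = response.get("filteredActiveQueries", {}).get("queryMap", {})
    return Counter(entry["QueryContext"]["index"] for entry in query_map.values())


# Abbreviated from the sample output above: four active queries on two indexes.
sample = {
    "filteredActiveQueries": {
        "queryCount": 4,
        "queryMap": {
            "b91d75480470f979f65f04e8f20a1f7b-17": {"QueryContext": {"index": "DemoIndex1"}},
            "b91d75480470f979f65f04e8f20a1f7b-18": {"QueryContext": {"index": "DemoIndex2"}},
            "b91d75480470f979f65f04e8f20a1f7b-19": {"QueryContext": {"index": "DemoIndex2"}},
            "f76b2d51397feee28c1e757ed426ef93-3": {"QueryContext": {"index": "DemoIndex1"}},
        },
    }
}

for index_name, count in sorted(count_by_index(sample).items()):
    print(index_name, count)
```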

          The /api/query endpoint takes an optional "longerThan" argument. With this argument, users can restrict the results to queries that have been running longer than the given duration. For example:

          curl -XGET -H "Content-Type: application/json" -u <username>:<password> \
          'http://localhost:8094/api/query?longerThan=7s' 

          Output

          {
            "status": "ok",
            "stats": {
              "total": 3,
              "successful": 3
            },
            "totalActiveQueryCount": 3,
            "filteredActiveQueries": {
              "longerThan": "7s",
              "queryCount": 1,
              "queryMap": {
                "b91d75480470f979f65f04e8f20a1f7b-21": {
                  "QueryContext": {
                    "query": {
                      "query": "decent hotel with a pool in italy"
                    },
                    "size": 10,
                    "from": 0,
                    "timeout": 120000,
                    "index": "DemoIndex1"
                  },
                  "executionTime": "10.541956741s"
                }
              }
            }
          }
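
The "longerThan" argument and the "executionTime" values share the duration string format described earlier (decimal numbers with unit suffixes). The following is a minimal Python sketch of a parser for that format and a client-side check that mirrors the filter. This is an illustration only, not the server's implementation; the names `parse_duration` and `running_longer_than` are hypothetical, and the sample combines entries from the outputs above.

```python
import re
from typing import Dict, List

# Seconds per unit, for the units listed in the duration format description.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6,
                 "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
# One number (optional fraction) followed by one unit, e.g. "1.5h" or "45m".
_TOKEN = re.compile(r"(\d+(?:\.\d*)?|\.\d+)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Convert a duration string such as '2h45m' to seconds."""
    sign, rest = 1.0, text
    if rest and rest[0] in "+-":
        sign = -1.0 if rest[0] == "-" else 1.0
        rest = rest[1:]
    if rest == "0":
        return 0.0
    if not rest:
        raise ValueError(f"invalid duration: {text!r}")
    total, pos = 0.0, 0
    while pos < len(rest):
        match = _TOKEN.match(rest, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return sign * total


def running_longer_than(response: Dict, threshold: str) -> List[str]:
    """Return queryMap keys whose executionTime exceeds the threshold."""
    limit = parse_duration(threshold)
    query_map = response.get("filteredActiveQueries", {}).get("queryMap", {})
    return [key for key, entry in query_map.items()
            if parse_duration(entry["executionTime"]) > limit]


# Illustrative input assembled from execution times in the sample outputs.
sample = {
    "filteredActiveQueries": {
        "queryMap": {
            "b91d75480470f979f65f04e8f20a1f7b-21": {"executionTime": "10.541956741s"},
            "b91d75480470f979f65f04e8f20a1f7b-18": {"executionTime": "2.144712787s"},
        }
    }
}

print(parse_duration("2h45m"))
print(running_longer_than(sample, "7s"))
```

Passing "longerThan" to the endpoint remains the simpler route; a client-side check like this is mainly useful for re-filtering a response that has already been fetched.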

          API Cancel Query

          /api/query/{queryID}/cancel

          Users can use the /api/query/{queryID}/cancel endpoint to cancel the query with ID queryID on its coordinator node. The UUID of the coordinator node is passed in the request body as the uuid parameter. For example:

          curl -XGET -H "Content-Type: application/json" -u <username>:<password> \
          http://localhost:8094/api/query/24/cancel -d \
          '{ "uuid": "b91d75480470f979f65f04e8f20a1f7b" }' 

          Output:

          {
            "status": "ok",
            "msg": "query with ID '24' on node 'b91d75480470f979f65f04e8f20a1f7b' was aborted!"
          }
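
Since each queryMap key already contains both the coordinator UUID and the local query ID, the cancel request can be derived from a key taken from a listing response. The following is a minimal Python sketch that only builds the URL and JSON body. The function name `cancel_request_parts` is hypothetical, and the HTTP method and authentication are left to the caller's HTTP client.

```python
import json
from typing import Tuple
from urllib.parse import quote


def cancel_request_parts(base_url: str, query_key: str) -> Tuple[str, bytes]:
    """Derive the cancel URL and JSON body from a 'uuid-queryID' key."""
    uuid, _, query_id = query_key.rpartition("-")
    if not uuid or not query_id:
        raise ValueError(f"unexpected query key: {query_key!r}")
    # The query ID goes into the path; the coordinator UUID goes into the body.
    url = f"{base_url.rstrip('/')}/api/query/{quote(query_id, safe='')}/cancel"
    body = json.dumps({"uuid": uuid}).encode("utf-8")
    return url, body


url, body = cancel_request_parts(
    "http://localhost:8094", "b91d75480470f979f65f04e8f20a1f7b-24")
print(url)
print(body.decode("utf-8"))
```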


          People

            thejas.orkombu Thejas Orkombu
            Sreekanth Sivasankaran Sreekanth Sivasankaran