Couchbase Server, MB-41067

optimize 5 sec /getIndexStatus calls from UI

    Description

      The UI makes a /getIndexStatus REST API call to the indexer every 5 seconds. The indexer gathers index information from all indexer nodes in the cluster and responds.

      With thousands of indexes, this may not scale well: both the CPU cost and the garbage generated are expensive.

      1. This REST API seems to be called even when the UI is not being used. Is this required?

      2. Is it possible to fetch the index list for a subset only, e.g. the default scope/collection, and fetch the rest on demand?

      3. Index status (Ready/Created etc.) and stats (items etc.) together make up the summary for each index line item on the UI. With Prometheus, the stats will be fetched every 30 seconds (not sure if there is an option to change the interval on demand). This can make the whole index summary misleading, as stats and status can be out of sync for 20-25 seconds.

      4. We can discuss more sophisticated schemes, e.g. /getIndexStatus could return only the list of changed indexes based on its cache, if it is feasible for the UI to deal with that. The UI would still have the option to fetch everything whenever required.

          Activity

            MB-43967, which supports this from the GSI side, is almost complete. It has three parts:

            • Part 1: Infrastructure.
            • Part 2: Indexer internal use of ETags for index metadata.
            • Part 3: Support for ETags from external caller to /getIndexStatus endpoint.

            Parts 1 and 2 are fully working and merged to GSI's unstable branch and Jeelan Poola is planning to merge them to master soon.

            Part 3 is coded and in Gerrit, but it does not seem to obey the ETag passed in by an external caller, so it may need an additional patch set before merging. I am out of the office until Mon 2021-03-15 and will look deeper into this then.

            FYI Deepkaran Salooja

            kevin.cherkauer Kevin Cherkauer added a comment
            kevin.cherkauer Kevin Cherkauer added a comment - - edited

            MB-43967 is complete. The problem I saw on Thursday afternoon was a curl problem, not a code problem; the code was working fine. (The StackOverflow comment I was working from was adamant about how to correctly quote the If-None-Match header in curl so it would work, but those details were incorrect and quoting it as they stated made curl not send that header field at all.)

            I have merged Part 3, which is the final part of the GSI implementation, to GSI's unstable branch. It now awaits only Jeelan Poola merging unstable to master.

            Example of how to use in curl:

            1. Request status with no ETag – will always return full status (host is an indexer node and port is GSI's "httpPort"):

            % curl --dump-header curl.header.etag.out -X GET -u Administrator:password "http://localhost:9120/getIndexStatus"
             
            {"code":"success","status":[{"defnId":948398447208412855,"instId":18064156543217770273,"name":"idx_3","bucket":"default","scope":"_default","collection":"_default","secExprs":["`col_3`"],"indexType":"plasma","status":"Ready","definition":"CREATE INDEX `idx_3` ON `default`(`col_3`) WITH {  \"nodes\":[ \"127.0.0.1:9001\" ] }","hosts":["127.0.0.1:9001"],"completion":0,"progress":0,"scheduled":false,"partitioned":false,"numPartition":1,"partitionMap":{"127.0.0.1:9001":[0]},"numReplica":0,"indexName":"idx_3","replicaId":0,"stale":false,"lastScanTime":"NA"}]}

            % cat curl.header.etag.out
            HTTP/1.1 200 OK
            Content-Type: application/json
            ETag: 97b7a0cfd4ad04b9
            Date: Mon, 15 Mar 2021 18:17:22 GMT
            Content-Length: 559
            

             

            2. Request status again, passing the returned ETag in the If-None-Match field (NOT the ETag field, because REST is by nature an ex post facto hack on top of HTTP and this is the officially designated header field for callers to pass in an ETag):

            % curl --dump-header curl.header.etag.out -X GET -u Administrator:password "http://localhost:9120/getIndexStatus" --header "If-None-Match: 97b7a0cfd4ad04b9"

            % cat curl.header.etag.out
            HTTP/1.1 304 Not Modified
            ETag: 97b7a0cfd4ad04b9
            Date: Mon, 15 Mar 2021 18:17:53 GMT
            

            In #2, the ETag was still fresh and the index metadata had not changed, so the GSI server responded with code 304 "Not Modified" and an empty body. This is why the curl call in #2 does not produce any output on the command line: all it returned was a header, which was dumped to the file curl.header.etag.out. The header again contains the ETag.
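
For callers that are not using curl, the two requests above can be sketched in Go. This is only an illustration under stated assumptions, not ns_server's or GSI's actual code: fetchIndexStatus and newStubIndexer are hypothetical names, and the stub server merely mimics the 200/304 behavior shown in the curl output so the example runs without a cluster.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchIndexStatus (hypothetical helper) issues a GET and, when etag is
// non-empty, sends it back in the If-None-Match request header.
// On 304 it returns a nil body and modified=false.
func fetchIndexStatus(client *http.Client, url, user, pass, etag string) (body []byte, newETag string, modified bool, err error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, "", false, err
	}
	req.SetBasicAuth(user, pass)
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, "", false, err
	}
	defer resp.Body.Close()

	newETag = resp.Header.Get("ETag")
	switch resp.StatusCode {
	case http.StatusNotModified:
		return nil, newETag, false, nil // cached status is still valid
	case http.StatusOK:
		body, err = io.ReadAll(resp.Body)
		return body, newETag, true, err
	default:
		return nil, "", false, fmt.Errorf("unexpected status %s", resp.Status)
	}
}

// newStubIndexer (hypothetical) stands in for an indexer node: it answers
// 304 when If-None-Match equals its fixed ETag and 200 with a body otherwise.
func newStubIndexer() *httptest.Server {
	const etag = "97b7a0cfd4ad04b9"
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("ETag", etag)
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `{"code":"success","status":[]}`)
	}))
}

func main() {
	srv := newStubIndexer()
	defer srv.Close()
	url := srv.URL + "/getIndexStatus"

	// First call: no ETag, so the full status comes back.
	body, etag, modified, err := fetchIndexStatus(srv.Client(), url, "Administrator", "password", "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("first call: modified=%v etag=%s bytes=%d\n", modified, etag, len(body))

	// Second call: pass the ETag back unchanged as a string.
	body, etag, modified, err = fetchIndexStatus(srv.Client(), url, "Administrator", "password", etag)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("second call: modified=%v etag=%s bytes=%d\n", modified, etag, len(body))
}
```

Against a real cluster, the URL would be an indexer node and GSI's "httpPort", as in the curl example (e.g. http://localhost:9120/getIndexStatus), and the caller would keep its previously fetched status whenever modified is false.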

            The ETags are set to expire on average every 4 minutes (240 seconds). This is controlled by the config setting indexer.settings.eTagPeriod, defined in indexing/secondary/common/config.go:

            "indexer.settings.eTagPeriod": ConfigValue{
                240,
                "Average ETag expiration period in seconds",
                240,
                false, // mutable
                false, // case-insensitive
            },

            If absolutely nothing has changed after expiry (including the lastScanTime of all the indexes), the new ETag will come out the same as the prior ETag (since it's a checksum) and the response will still be 304 Not Modified with the same ETag. If there are scans going on, however, the lastScanTime will change for any indexes being hit and the new ETag will differ from the old one, prompting a full 200 OK response and payload with the new status data, plus the new ETag in the header.

            Passing no ETag or a 0 ETag (which is treated as invalid) will always force a full response to be returned.

            The ETag in the header is a hex string representation of a uint64 in Go. Since it is a string in the headers, the caller never needs to convert it and only has to pass it back unchanged.
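
The rules above (checksum-derived ETag, hex string form, zero or missing ETag forcing a full response) can be sketched as follows. This is an illustration only, not the GSI implementation: the function names are hypothetical, FNV-1a is an assumed stand-in because the comment only says the ETag is a checksum, and whether GSI zero-pads the hex string is not stated here.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// checksumETag derives a 64-bit ETag from a serialized status payload.
// ASSUMPTION: FNV-1a is used purely for illustration.
func checksumETag(payload []byte) uint64 {
	h := fnv.New64a()
	h.Write(payload) // hash.Hash writes never return an error
	return h.Sum64()
}

// formatETag renders the ETag as lowercase hex, the form seen in the headers.
func formatETag(etag uint64) string {
	return strconv.FormatUint(etag, 16)
}

// parseETag converts an If-None-Match value back to a uint64.
// ok is false for a missing, malformed, or zero ETag.
func parseETag(header string) (etag uint64, ok bool) {
	v, err := strconv.ParseUint(header, 16, 64)
	if err != nil || v == 0 {
		return 0, false
	}
	return v, true
}

// notModified reports whether the server may answer 304 instead of
// sending the full payload.
func notModified(ifNoneMatch string, current uint64) bool {
	clientETag, ok := parseETag(ifNoneMatch)
	return ok && clientETag == current
}

func main() {
	payload := []byte(`{"status":"Ready","lastScanTime":"NA"}`)
	current := checksumETag(payload)

	// Same payload, same checksum: the caller's ETag still matches.
	fmt.Println(notModified(formatETag(current), current))

	// Missing and zero ETags always force a full response.
	fmt.Println(notModified("", current), notModified("0", current))
}
```

Because the ETag is recomputed from the content, unchanged metadata yields the same value even after an expiry, which is why a caller can keep receiving 304 with the same ETag.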

            Details of ETag expiry times:

            // generateETagExpiry returns the next expiration UnixNano time of ETags based on
            // the current time. The expiry will be the next future rounded-to-S-seconds time
            // that is at least S/2 seconds away, where S is specified by config variable
            // indexer.settings.eTagPeriod (currently 240). The expiry will thus average S
            // seconds in the future but can be anything between S/2 and 3S/2. (Most of the
            // time it will be very close to S because ns_server calls getIndexStatus much more
            // frequently, triggering new ETag creations soon after the prior expiry.) The rounding
            // is done to try to keep expiry times aligned across all nodes (unfortunately
            // jittered by any internode clock differences), so getIndexStatus for many nodes
            // will likely have either all or none of its individual LocalIndexMetadata ETags
            // unexpired, as if even one's ETag is expired we must send full results to caller.
            

            So GSI ETags always expire at times that are multiples of 4 minutes (e.g. hh:mm:ss 10:00:00.000, 10:04:00.000, 10:08:00.000, etc.), regardless of when they are generated. If the next available expiry boundary is less than 2 minutes in the future, the one following it is used instead. Thus an ETag generated at 10:01:59.999 will expire at 10:04:00.000, while one generated at 10:02:00.001 will expire at 10:08:00.000.


            MB-43967 is all in CC master as of build couchbase-server-7.0.0-4701.

            kevin.cherkauer Kevin Cherkauer added a comment

            Build couchbase-server-7.0.0-4711 contains ns_server commit 727df1b with commit message:
            MB-41067: Implement caching for getIndexStatus with Etag

            build-team Couchbase Build Team added a comment

            Verified in 7.0.0-4990.

            Created ~800 indexes and updated them. Index updates show up on the UI with good performance; it seems faster than before the fix, though I did not measure the time, and at the very least it's not worse. We see the ETag entry in the logs as well. I mass-updated a hundred indexes and it worked fine and was reflected quickly on the UI.

            ajay.bhullar Ajay Bhullar added a comment

            People

              ajay.bhullar Ajay Bhullar
              deepkaran.salooja Deepkaran Salooja