  1. Couchbase Server
  2. MB-43256

No bucket stats in test with 1000 collections on build 7.0.0-3938

Details

    • Untriaged
    • 1
    • Unknown

    Description

      There is one scope with 1000 collections in this test. The test inserts 20M docs and then checks bucket stats to make sure the replica count matches. When the test checks bucket stats, the response contains no samples. This issue is reproducible on build 7.0.0-3938.

      http://172.23.133.13:8091/pools/default/buckets/bucket-1/stats

      {"op":{"samples":{"timestamp":[]},"samplesCount":60,"isPersistent":true,"lastTStamp":0,"interval":1000},"hot_keys":[]}
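
      A test harness could detect this empty response with a check along the following lines. This is a minimal sketch, not the actual perf test code: the helper name has_samples is hypothetical, and it assumes that a healthy response carries a non-empty op.samples.timestamp list.

```python
import json


def has_samples(stats_body: str) -> bool:
    """Return True if the stats response carries at least one sample timestamp.

    Hypothetical helper: assumes a healthy response has a non-empty
    op.samples.timestamp list.
    """
    payload = json.loads(stats_body)
    samples = payload.get("op", {}).get("samples", {})
    return len(samples.get("timestamp", [])) > 0


# Response body quoted in this report (build 7.0.0-3938).
empty_body = (
    '{"op":{"samples":{"timestamp":[]},"samplesCount":60,'
    '"isPersistent":true,"lastTStamp":0,"interval":1000},"hot_keys":[]}'
)

print(has_samples(empty_body))  # False: the timestamp list is empty
```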

      Build: 7.0.0-3938

      Job: http://perf.jenkins.couchbase.com/job/ares/18578/ 

      Logs:

      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2020-12-11T202854-ns_1%40172.23.133.13.zip

      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2020-12-11T202854-ns_1%40172.23.133.14.zip

      Note that,

      1. Another test using default collection doesn't hit this issue on build 7.0.0-3938.
      2. The same test doesn't hit this issue on build 7.0.0-4007. Therefore, I'm not sure if the issue is fixed or it just happens intermittently.
        http://perf.jenkins.couchbase.com/job/ares/18580/ 


          Activity

            dfinlay Dave Finlay added a comment -

            Timofey - I think this is a dupe of one already filed. Can you reply on it?


            timofey.barmin Timofey Barmin added a comment -

            Dave Finlay I don't think this is a dupe. At least I don't remember similar tickets. I will take a look at logs.

            timofey.barmin Timofey Barmin added a comment - edited

            Bucket stats are missing because the request to Prometheus times out. This is the same issue that caused MB-43101. The problem was fixed in build 7.0.0-3990, which explains why you don't see it anymore.

            In build 7.0.0-3938, the mem_used stat for the bucket and the mem_used stat for collections had the same name. As a result, every bucket stats request also extracted a large number of collection stats, which caused the request to time out. In 3990, memcached fixed this by renaming the collection-level mem_used stat, and the bucket stats request no longer times out. The request still takes a fairly long time to execute (2-3 seconds), though, so we probably need to optimize bucket stats extraction.

            UPDATE: False alarm. The high latency I saw was caused mostly by the VPN. When the request is executed locally, it takes about 600 ms, which is still not fast but much better.
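
            The name collision described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not the ns_server or Prometheus code: series are selected by metric name alone, the label keys "bucket" and "collection" are hypothetical, and "collection_mem_used" is a placeholder because the ticket does not state the new stat name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    name: str
    labels: tuple  # (key, value) pairs; label keys here are hypothetical


def make_series(bucket_stat, collection_stat, n_collections):
    """Build one bucket-level series plus one series per collection."""
    series = [Series(bucket_stat, (("bucket", "bucket-1"),))]
    for i in range(n_collections):
        labels = (("bucket", "bucket-1"), ("collection", f"c{i}"))
        series.append(Series(collection_stat, labels))
    return series


def select_by_name(series, name):
    """Toy selector: match on metric name only, ignoring labels."""
    return [s for s in series if s.name == name]


# Same name at both levels, as described for build 7.0.0-3938.
colliding = make_series("mem_used", "mem_used", 1000)
# Collection-level stat renamed; "collection_mem_used" is a placeholder name.
renamed = make_series("mem_used", "collection_mem_used", 1000)

print(len(select_by_name(colliding, "mem_used")))  # 1001
print(len(select_by_name(renamed, "mem_used")))    # 1
```

            With 1000 collections, the colliding name makes a bucket-level lookup touch 1001 series instead of 1, which is consistent with the timeout disappearing once the collection-level stat was renamed.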


            bo-chun.wang Bo-Chun Wang added a comment -

            I am closing this ticket. The issue is fixed in 7.0.0-3990. I have a good run with 7.0.0-4023.

            http://perf.jenkins.couchbase.com/job/ares/18694/

            People

              bo-chun.wang Bo-Chun Wang
              bo-chun.wang Bo-Chun Wang