  Couchbase Server / MB-48716

Updating "exp_pager_stime" (using cbepctl) results in a delay before the expiry pager runs at the new time interval.

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 7.1.0
    • 7.1.0
    • couchbase-bucket
    • Enterprise Edition 7.1.0 build 1332
    • 1
    • Yes
    • KV-Engine-Sept-21, KV 2021-Oct-21

    Description

      In builds greater than or equal to 1332:

      Updating "exp_pager_stime" using cbepctl results in a delay of a few minutes before the expiry pager runs at the new time interval.

      Steps to reproduce:
      On a single KV node:
      1. Create a bucket named 'default' with the default settings.
      2. Run the following cbepctl command to update 'exp_pager_stime' and display the date:

      cbepctl command

      # /opt/couchbase/bin/cbepctl localhost:11210 -u Administrator -p password -b 'default' set flush_param exp_pager_stime 10 && date
      

      What happens on build 1331:

      The command completes at 14:25:57:

      cbepctl command output followed by date command

      setting param: exp_pager_stime 10
      set exp_pager_stime to 10
      Fri  1 Oct 14:25:57 BST 2021
      

      The change takes effect immediately, and the first expiry pager run happens between 14:26:15 and 14:26:30.

      Logs from build 1331: https://cb-engineering.s3.amazonaws.com/MB-48716/kv-qe/collectinfo-2021-10-01T141004-ns_1%40127.0.0.1.zip

      What happens on build 1332:

      The command completes at 15:19:03:

      cbepctl command output followed by date command

      setting param: exp_pager_stime 10
      set exp_pager_stime to 10
      Fri  1 Oct 15:19:03 BST 2021
      

      There is a delay of roughly 9 minutes before the change takes effect: the first expiry pager run happens between 15:28:15 and 15:28:30.

      Logs from build 1332: https://cb-engineering.s3.amazonaws.com/MB-48716/kv-qe/collectinfo-2021-10-01T143535-ns_1%40127.0.0.1.zip
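
      The two delays can be checked with a little arithmetic. The sketch below is only a convenience for reading the timestamps quoted above; it assumes the first run happened at the start of each observed 15-second window, and the helper name delay_seconds is made up for this example.

```python
from datetime import datetime

FMT = "%H:%M:%S"


def delay_seconds(command_done: str, first_run: str) -> int:
    """Seconds between cbepctl completing and the earliest observed pager run."""
    delta = datetime.strptime(first_run, FMT) - datetime.strptime(command_done, FMT)
    return int(delta.total_seconds())


# Timestamps copied from the report. The first run is taken as the start of
# the window in which it was observed (an assumption, not a logged value).
build_1331 = delay_seconds("14:25:57", "14:26:15")
build_1332 = delay_seconds("15:19:03", "15:28:15")

print(f"build 1331: {build_1331} s")
print(f"build 1332: {build_1332} s (~{build_1332 / 60:.1f} min)")
```

      With these inputs the gap is 18 seconds on build 1331 and 552 seconds (a little over 9 minutes) on build 1332.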

      (Hypothesis) What may be the cause:

      On builds < 1332:

      When "exp_pager_stime" is updated, either SetExpiryPagerTimerTask or SetExpiryTimerSleepTime is called, and that call creates a new ExpiryItemPager object.

      Each ExpiryItemPager has its own semaphore which controls the number of PagingVisitors that can service the ExpiryItemPager.

      A PagingVisitor can immediately service the newly created ExpiryItemPager, which explains the behaviour where the changes take effect immediately.

      There can be multiple ExpiryItemPager objects (see: MB-41403). As each of them owns a semaphore, there can be more PagingVisitors visiting ExpiryItemPagers than intended.

      On builds >= 1332:

      In order to control the number of PagingVisitors, an optimisation (see: http://review.couchbase.org/c/kv_engine/+/161944) was added as part of MB-41403, to prevent new ExpiryItemPager objects being created when SetExpiryPagerTimerTask or SetExpiryTimerSleepTime is called.

      A side effect of this optimisation is that an update to the interval takes effect only once the previously scheduled run completes, which can take as long as the period the previous 'exp_pager_stime' was set to.
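
      The hypothesis above can be illustrated with a toy scheduling model. This is a sketch of the suspected behaviour only, not kv_engine code: PagerModel, update_by_replacing and update_in_place are hypothetical names, time is simulated in whole seconds, and the previous interval of 600 seconds and the update time of 60 seconds are illustrative values rather than figures taken from the logs.

```python
from dataclasses import dataclass


@dataclass
class PagerModel:
    """Toy stand-in for an expiry pager task; not the kv_engine class."""
    interval: int  # seconds between runs
    next_run: int  # simulated time of the next scheduled run

    def run_times(self, until: int) -> list:
        """Return every simulated run time up to and including `until`."""
        times = []
        t = self.next_run
        while t <= until:
            times.append(t)
            t += self.interval
        return times


def update_by_replacing(pager: PagerModel, now: int, new_interval: int) -> PagerModel:
    # Suspected behaviour on builds < 1332: a fresh pager is created and
    # scheduled with the new interval straight away.
    return PagerModel(interval=new_interval, next_run=now + new_interval)


def update_in_place(pager: PagerModel, now: int, new_interval: int) -> PagerModel:
    # Suspected behaviour on builds >= 1332: the existing pager is reused, so
    # the wake-up that is already scheduled is left untouched.
    return PagerModel(interval=new_interval, next_run=pager.next_run)


# Illustrative values (assumptions): pager scheduled at t=0 with a 600 s
# interval, then the interval is changed to 10 s at t=60.
original = PagerModel(interval=600, next_run=600)
replaced = update_by_replacing(original, now=60, new_interval=10)
in_place = update_in_place(original, now=60, new_interval=10)

print("replaced pager first run at t =", replaced.run_times(620)[0])
print("in-place pager first run at t =", in_place.run_times(620)[0])
```

      In this model the replaced pager first runs 10 seconds after the update, while the in-place pager does not run until the old 600-second sleep has elapsed and only then switches to the 10-second cadence.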

      Possible impact on QE:

      Many expiry tests rely on having fine-grained control over when the expiry pager runs.

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              asad.zaidi Asad Zaidi (Inactive)