Details
-
Bug
-
Resolution: Fixed
-
Major
-
7.1.0
-
Enterprise Edition 7.1.0 build 1332
-
1
-
Yes
-
KV-Engine-Sept-21, KV 2021-Oct-21
Description
In builds greater than or equal to 1332:
Updating "exp_pager_stime" using cbepctl results in a delay of a few minutes before the expiry pager runs at the new time interval.
Steps to reproduce:
On a single KV node:
1. Create a bucket named 'default' with the default settings.
2. Run the following cbepctl command to update 'exp_pager_stime' and display the date:
cbepctl command:
# /opt/couchbase/bin/cbepctl localhost:11210 -u Administrator -p password -b 'default' set flush_param exp_pager_stime 10 && date
What happens on build 1331:
The command completes at 14:25:57:
cbepctl command output followed by date command:
setting param: exp_pager_stime 10
set exp_pager_stime to 10
Fri 1 Oct 14:25:57 BST 2021
The change takes effect immediately: the first run happens between 14:26:15 and 14:26:30.
Logs from build 1331: https://cb-engineering.s3.amazonaws.com/MB-48716/kv-qe/collectinfo-2021-10-01T141004-ns_1%40127.0.0.1.zip
What happens on build 1332:
The command completes at 15:19:03:
cbepctl command output followed by date command:
setting param: exp_pager_stime 10
set exp_pager_stime to 10
Fri 1 Oct 15:19:03 BST 2021
There is a delay of roughly 9 minutes before the change takes effect: the first run happens between 15:28:15 and 15:28:30.
Logs from build 1332: https://cb-engineering.s3.amazonaws.com/MB-48716/kv-qe/collectinfo-2021-10-01T143535-ns_1%40127.0.0.1.zip
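The delay on each build can be worked out from the timestamps quoted above. The following is a minimal sketch that only does the arithmetic; the helper name delay_seconds is illustrative and not part of any Couchbase tool.

```python
from datetime import datetime

FMT = "%H:%M:%S"

def delay_seconds(command_done: str, first_run: str) -> int:
    """Seconds between the cbepctl command finishing and a pager run."""
    delta = datetime.strptime(first_run, FMT) - datetime.strptime(command_done, FMT)
    return int(delta.total_seconds())

# Timestamps copied from the two runs described in this ticket (same day, BST):
# (command completed, earliest first run, latest first run)
observations = {
    1331: ("14:25:57", "14:26:15", "14:26:30"),
    1332: ("15:19:03", "15:28:15", "15:28:30"),
}

for build, (done, earliest, latest) in observations.items():
    lo = delay_seconds(done, earliest)
    hi = delay_seconds(done, latest)
    print(f"build {build}: first run {lo}s to {hi}s after the command")
```

On build 1331 the first run falls 18 to 33 seconds after the command; on build 1332 it falls 552 to 567 seconds after it, which is the "roughly 9 minutes" reported.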
(Hypothesis) What may be the cause:
On builds < 1332:
When "exp_pager_stime" is updated, either SetExpiryPagerTimerTask or SetExpiryTimerSleepTime is called, which creates a new ExpiryItemPager object.
Each ExpiryItemPager has its own semaphore which controls the number of PagingVisitors that can service the ExpiryItemPager.
A PagingVisitor can immediately service the newly created ExpiryItemPager, which explains the behaviour where the changes take effect immediately.
There can be multiple ExpiryItemPager objects (see: MB-41403). As each of them owns a semaphore, there can be more PagingVisitors visiting ExpiryItemPagers than intended.
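To illustrate why a semaphore per pager loosens the limit, here is a toy model in Python. It is not the kv_engine code: ToyPager, max_concurrent_visitors and the limit of 1 are assumptions made for the example.

```python
import threading

class ToyPager:
    """Hypothetical stand-in for an ExpiryItemPager that owns its own semaphore."""

    def __init__(self, max_visitors: int):
        self.semaphore = threading.Semaphore(max_visitors)

def max_concurrent_visitors(pagers) -> int:
    """Count how many visitor tokens can be held at once across all pagers."""
    held = 0
    for pager in pagers:
        # Non-blocking acquire: take tokens until this pager's semaphore is empty.
        while pager.semaphore.acquire(blocking=False):
            held += 1
    return held

intended_limit = 1  # assumed limit, for illustration only

print(max_concurrent_visitors([ToyPager(intended_limit)]))
print(max_concurrent_visitors([ToyPager(intended_limit) for _ in range(3)]))
```

With one pager the model allows 1 visitor; with three pagers it allows 3, because the limit is enforced per pager rather than across all of them.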
On builds >= 1332:
To control the number of PagingVisitors, an optimisation (see: http://review.couchbase.org/c/kv_engine/+/161944) was added as part of MB-41403 to prevent new ExpiryItemPager objects from being created when SetExpiryPagerTimerTask or SetExpiryTimerSleepTime is called.
A side effect of this optimisation is that an update to the interval only takes effect once the previously scheduled run completes, which can take as long as the previous 'exp_pager_stime' value.
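The hypothesised difference between the two behaviours can be sketched as a small timing model. This is a simplification under stated assumptions, not the kv_engine scheduler: the function name, its parameters and the 600-second previous interval are all hypothetical.

```python
def first_run_after_update(update_time: float, old_interval: float,
                           new_interval: float, last_wake: float,
                           recreate_task: bool) -> float:
    """Return the time of the first pager run after the interval is updated.

    Toy model: the existing task is asleep until last_wake + old_interval,
    and the update arrives somewhere inside that sleep.
    """
    assert last_wake <= update_time <= last_wake + old_interval
    if recreate_task:
        # Builds < 1332 (as hypothesised): a fresh pager starts its sleep now.
        return update_time + new_interval
    # Builds >= 1332 (as hypothesised): the existing task finishes its
    # current sleep before the new interval is used.
    return last_wake + old_interval

# Assumed values: previous interval 600s, update 40s into the sleep, new interval 10s.
old_behaviour = first_run_after_update(40, 600, 10, last_wake=0, recreate_task=True)
new_behaviour = first_run_after_update(40, 600, 10, last_wake=0, recreate_task=False)
print(old_behaviour - 40, new_behaviour - 40)
```

In this model the wait after the update is the new interval (10 seconds) when the task is recreated, and the remainder of the old sleep (560 seconds) when the existing task is reused.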
Possible impact on QE:
Many expiry tests rely on having fine-grained control over when the expiry pager runs.
Issue Links
- is caused by: MB-41403 Investigate/Improve performance of document expiry throughput (Closed)