Details
- Bug
- Resolution: Fixed
- Major
- 3.0
- Security Level: Public
- None
- Observed on many.
- Untriaged
- Unknown
Description
Whilst investigating rebalance performance (time, disk and network I/O), I was curious as to why a 100% resident rebalance was reading from disk at all (Linux disk caches were dropped prior to the rebalance).
I found that DCP backfill makes a call to retrieve the number of items in a couchstore data file, followed by a "dump" of the file. That is two open/close cycles on the same file, and the first one looks removable.
Given that the dump of seqno -> end is going to execute the CacheCallback for each key, we can use the CacheCallback to also increment "backfill-remaining".
This appears to be a safe thing to do, as I can't see that the dump/scan itself requires the backfill count. If it does, then we can scrap this simple fix and I'll move along to the next set of tests and investigations.
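To make the idea concrete, here is a minimal sketch in Python rather than the engine's own code. Everything in it is hypothetical: `FakeDataFile`, `backfill_two_pass` and `backfill_single_pass` are illustrative names, not ep-engine or couchstore APIs, and the `opens` counter simply stands in for a file open/close. It only models the shape of the change: count items inside the scan callback instead of opening the file a second time just to count.

```python
class FakeDataFile:
    """Hypothetical stand-in for an on-disk data file; tracks how often it is opened."""

    def __init__(self, items):
        self.items = list(items)  # (seqno, key) pairs in seqno order
        self.opens = 0

    def count_items(self):
        # Models a separate open/close done only to fetch the item count.
        self.opens += 1
        return len(self.items)

    def scan(self, start_seqno, callback):
        # Models the "dump": one open/close, callback invoked once per key.
        self.opens += 1
        for seqno, key in self.items:
            if seqno >= start_seqno:
                callback(seqno, key)


def backfill_two_pass(datafile, start_seqno):
    """Current behaviour as described: count first, then scan (two opens)."""
    remaining = datafile.count_items()
    sent = []
    datafile.scan(start_seqno, lambda seqno, key: sent.append(key))
    return remaining, sent


def backfill_single_pass(datafile, start_seqno):
    """Proposed behaviour: the scan callback also bumps the counter (one open)."""
    state = {"remaining": 0}
    sent = []

    def cache_callback(seqno, key):
        state["remaining"] += 1  # plays the role of "backfill-remaining"
        sent.append(key)

    datafile.scan(start_seqno, cache_callback)
    return state["remaining"], sent
```

One difference the sketch makes visible: in the single-pass version the count only reaches its final value once the scan has finished, and it counts only keys at or after `start_seqno`, whereas `count_items` returns the total for the whole file up front. Whether that matters depends on whether anything reads the count before or during the scan, which is the open question raised above.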
With the patch it was hard to see whether there's a perf improvement in rebalance time, but we do reduce the amount of data read from disk. The clearest stat was the number of pread64 calls a node performed during rebalance. The average of 2 runs shows the following.
1. Current master - 28,902 calls to pread64
2. With patch - 24,170 calls to pread64
That's roughly 16% fewer calls. It's got to be useful?
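For reference, the percentage follows directly from the two averages quoted above. A small sketch of the arithmetic (the helper name `reduction_pct` is just for illustration):

```python
def reduction_pct(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100.0


# Averages of 2 runs, taken from the figures quoted in this ticket.
master_calls = 28902
patched_calls = 24170

saved = master_calls - patched_calls               # 4732 fewer pread64 calls
pct = reduction_pct(master_calls, patched_calls)   # about 16.4
```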