Details
- Improvement
- Resolution: Done
- Critical
- Morpheus
- None
- 1
Description
What's the issue?
Currently, when couchstore looks for a header at a given timestamp or seqno, it performs a linear search. This reads a large number of headers from disk, which in turn slows down whichever task triggered the search.
What's affected?
So far, DCP backfill (which uses 'seekFirstHeaderContaining') and compaction (which uses 'locateStartHeader') are known to be affected. In the compaction case, this means we read every byte of the file that falls within the max history age.
What's the fix?
Header timestamps should be in chronological order, so we should be able to use a binary search (or at least a binary search in one direction).
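The idea can be illustrated with a minimal Go sketch. It is not the couchstore implementation: the `header` and `store` types, the `read` counter, and the `linearSeek`/`binarySeek` helpers are hypothetical names introduced here, and the headers live in an in-memory slice rather than on disk. It assumes timestamps never decrease in file order and compares how many headers each strategy reads to find the first header at or after a target timestamp.

```go
package main

import (
	"fmt"
	"sort"
)

// header is a hypothetical, simplified stand-in for a couchstore header.
type header struct {
	Timestamp uint64
}

// store simulates a file's headers in file order (oldest first) and
// counts every header read, standing in for a disk access.
type store struct {
	headers []header
	reads   int
}

func (s *store) read(i int) header {
	s.reads++
	return s.headers[i]
}

// makeHeaders builds n headers with strictly increasing timestamps.
func makeHeaders(n int) []header {
	hs := make([]header, n)
	for i := range hs {
		hs[i].Timestamp = uint64(i) * 10
	}
	return hs
}

// linearSeek walks back from the newest header until it passes the target,
// returning the index of the oldest header with Timestamp >= target
// (or len(headers) if there is none).
func linearSeek(s *store, target uint64) int {
	found := len(s.headers)
	for i := len(s.headers) - 1; i >= 0; i-- {
		if s.read(i).Timestamp < target {
			break
		}
		found = i
	}
	return found
}

// binarySeek returns the same index using binary search, which is only
// valid because the timestamps are assumed to be in chronological order.
func binarySeek(s *store, target uint64) int {
	return sort.Search(len(s.headers), func(i int) bool {
		return s.read(i).Timestamp >= target
	})
}

func main() {
	const n = 284922 // header count taken from the "Before" measurement
	lin := &store{headers: makeHeaders(n)}
	bin := &store{headers: makeHeaders(n)}

	// Target an old header, the worst case for a scan from the newest end.
	li := linearSeek(lin, 10)
	bi := binarySeek(bin, 10)

	fmt.Printf("linear: index %d after %d reads\n", li, lin.reads)
	fmt.Printf("binary: index %d after %d reads\n", bi, bin.reads)
}
```

A real on-disk search also has to locate a valid header near each probe offset, so the number of reads in practice can be higher than the idealised logarithmic bound in this sketch.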
Proof
I implemented the same behavior in our Go 'couchstore' reader with the following results:
Run | Result | Header returned |
---|---|---|
Before | Read 284922 headers in 46.650695443s | {Version:13 NextSeqNo:1 PurgeCounter:0 PurgedDocPointer:0 SeqNoRootSize:17 IDRootSize:28 LocalDocRootSize:12 Timestamp:1645199408328424758 SeqNoRoot:{Position:8317} IDRoot:{Position:8257} LocalDocRoot:{Position:8375}} |
After | Read 46 headers in 29.936022ms | {Version:13 NextSeqNo:1 PurgeCounter:0 PurgedDocPointer:0 SeqNoRootSize:17 IDRootSize:28 LocalDocRootSize:12 Timestamp:1645199408328424758 SeqNoRoot:{Position:8317} IDRoot:{Position:8257} LocalDocRoot:{Position:8375}} |
Standard benchmarking procedure was followed (e.g. flushing caches).
Issue Links
- relates to MB-46854 PiTR: Stream all snapshots (Closed)
For Gerrit Dashboard: MB-51107

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
173316,15 | MB-51107: Make seekFirstHeaderContaining() O(log N) | master | couchstore | MERGED | +2 | +1 |