Details
-
Bug
-
Resolution: Fixed
-
Major
-
4.6.0
-
Untriaged
-
Centos 64-bit
-
Yes
Description
All cbc-pillowfight cases (read-heavy and mixed, 512 and 1024 bytes) indicate a significant drop in KV throughput after these changes:
CHANGELOG for ep-engine
|
|
* Commit: 66882e8dcd1c0952dc63f6608c04d6b29c39c86c (in build: 3441)
|
Author: Dave Rigby
|
MB-20852 [17/N]: Serialize VB state changes
|
|
Background/Problem:
|
|
MB-20852 exposed an issue with how VBucket state was persisted to disk
|
- specifically that state can be persisted to disk out of order, and
|
intermediate state changes could be dropped (not persisted at all).
|
|
These problems are caused by the asynchronous tasks which are
|
responsible for persisting state to disk -
|
VBStatePersistTask{Low,High}. There are essentially two interrelated
|
issues:
|
|
1. We allow multiple tasks which persist VBState to exist concurrently
|
(VBStatePersistTask{Low,High}, Flusher,
|
VBSnapshotTask{Low,High}. Furthermore, multiple instances of the
|
same task (e.g. VBStatePersistTask) can also exist concurrenlty.
|
|
2. We do not maintain a queue of states to persist, instead we just
|
mark that a given vbid's state is pending.
|
|
(1) Can occur when a VBPersistTask has started to run on a BG writer
|
thread (and has cleared the schedule_vbstate_persist flag), but
|
then another scheduleVBStatePersist() call is made. As the
|
schedule_vbstate_persist flag is clear, this second task is
|
allowed to be created. There is then no guarantee which task will
|
complete first (they could be scheduled to different OS writer
|
threads, the first thread could be descheduled and the second one
|
then runs and completes first).
|
|
We could attempt to solve this by changing when
|
schedule_vbstate_persist is cleared (say move it later in the
|
persistVBState() function), but then the inverse problem is
|
exposed - we may fail to schedule a second (different) state to be
|
persisted if the current task is just finishing up (and will exit
|
without persisting the now-outstanding work).
|
|
(2) Presents a subtle problem relating to when the state of a VBucket
|
is materialized. As we only record the vbid to persist (and not
|
the state), by the time the VBucketPersistTask runs the actual
|
state we /wanted/ to write may have moved forward. Even worse, the
|
state could have "gone backwards" (as shown in the MB) if the
|
state of the VBucket is read before the vbucket is deleted,
|
meaning the task has a 'stale' view of the VBucket object (due to
|
us using ref-counted pointers for VBucket objects).
|
|
Additionally, not persisting all intermediate states makes
|
debugging harder. We don't actually change VBucket state /that/
|
often, and so having all the intermediate states a VBucket went
|
through on disk is extremely valuable in debugging.
|
|
Solution:
|
|
Instead of having multiple different tasks which can persist state
|
(and attempting to manage when they are created / when they run / what
|
state they persist), we instead use a single task (the Flusher task)
|
to persist state for a given vBucket. Note the Flusher *already*
|
persists vbucket state if necessary during commit (see
|
EventuallyPersistentStore::flushVBucket), so this path just adopts the
|
Flusher as the canonical Task to perform vbstate persistence.
|
|
To ensure that state is persisted even when there are no outstanding
|
'normal' items in the vbucket's checkpoint queue, we use the new
|
queue_op::set_vbucket_state meta-item to signify that a persist is
|
pending (and to essentially make the pending items queue non-empty so
|
the flusher will run).
|
|
After this patch all the other Tasks which used to persist vbucket
|
state are redundent - a subsequent patch will remove them.
|
|
Change-Id: Ic44360a1dd14fb882ebfa6f28f4ccfe6127d17a8
|
Reviewed-on: http://review.couchbase.org/69150
|
Reviewed-by: Jim Walker
|
Tested-by: buildbot
|
|
|
* Commit: 941c24b6fb201686a445503c566c8fc9a865a5b7 (in build: 3441)
|
Author: Dave Rigby
|
MB-20852 [13/N]: Checkpoint: Add getNumMetaItems() method
|
|
Add a new method to Checkpoint which returns the number of metaItems
|
in the checkpoint.
|
|
Initially this just returns a fixed value of 1 if the checkpoint is
|
open, or 2 if closed, as this matches the current checkpoint meta item
|
usage.
|
|
However subsequent patches will modify this to track the count of how
|
many meta items actually are in the checkpoint and hence allow us to
|
determine the meta item count when n queue_op::set_vbstate is added.
|
|
Change-Id: I411f2e97e16f9b11ac19a1b7165e4767d09f37d1
|
Reviewed-on: http://review.couchbase.org/69018
|
Reviewed-by: Trond Norbye
|
Tested-by: buildbot
|
|
|
* Commit: bed33dbde2a2acc56ebadac39b61c53c1ddbdcb2 (in build: 3441)
|
Author: Dave Rigby
|
MB-20852 [14/N]: Improve debug/logging in CheckpointManager
|
|
Include information on the CheckpointCursors associated with a
|
CheckpointManager by adding a operator<< for CheckpointCursor. Add a
|
dump() method to CheckpointManager to assist in debugging their
|
contents (e.g. from gdb).
|
|
Also add some additional CheckpointManager/CouchKVStore logging.
|
|
Change-Id: I8c3de5b5ec0e8e297db8530dee87ac0edd869a91
|
Reviewed-on: http://review.couchbase.org/69021
|
Reviewed-by: Trond Norbye
|
Tested-by: buildbot
|
|
|
* Commit: 3fa4aa0a7a27067fa55fa4658d2a8a126321829e (in build: 3441)
|
Author: Dave Rigby
|
MB-20852 [16/N]: Add queue_op::set_vbucket_state meta-item
|
|
Add a new meta-item to queue_op enum - set_vbucket_state. This will be
|
used to mark that a VBucket should have it's state persisted.
|
|
Change-Id: I108befd70933962d262529ee230382e47c64e31a
|
Reviewed-on: http://review.couchbase.org/69149
|
Reviewed-by: Trond Norbye
|
Tested-by: buildbot
|
|
|
* Commit: ba75d06f48f29e85c9eda66434ce70aa36b81471 (in build: 3441)
|
Author: Dave Rigby
|
MB-20852 [15/N]: Accurately track meta items within checkpoints
|
|
Instead of assuming that a Checkpoint only contains 1 (Open) or 2
|
(Closed) meta-items, maintain a count of items within each Checkpoint,
|
and track how many meta-items a CheckpointCursor has read.
|
|
This allows us to support an arbitrary number of meta-items within a
|
Checkpoint, and in any sequence. This feature will be used to add
|
support for set_vbstate meta-items in a subsquent patches.
|
|
Change-Id: I8fb3040cbe64e316aae1f693afee65001b2b4b17
|
Reviewed-on: http://review.couchbase.org/69020
|
Reviewed-by: Trond Norbye
|
Tested-by: buildbot
|
|
An example of workload:
cbc-pillowfight --spec couchbase://172.23.96.117/bucket-1 --password password --batch-size 1000 --num-items 1000000 --num-threads 50 --num-cycles 40000 --min-size 512 --max-size 512 --set-pct 20
|
It should be easy to navigate to the graphs and logs from ShowFast.
Good run (build 3440)
- Jenkins: http://perf.jenkins.couchbase.com/job/hera/197/
- cbcollect: https://s3-us-west-2.amazonaws.com/perf-artifacts/jenkins-hera-197/172.23.96.117.zip
Bad run (build 3441)