Details
- Bug
- Status: Closed
- Critical
- Resolution: Done
- master
- Untriaged
- Yes
Description
As reported by http://showfast.sc.couchbase.com/daily/#/history/KV%7CPillowfight,%2020/80%20R/W,%20256B%20binary%20items%7CMax%20Throughput%20(ops/sec), max throughput has dropped.
The drop appears to have occurred between builds 6.5.0-1545 and 6.5.0-1561:
Build | Max throughput (ops/sec) |
---|---|
6.5.0-1561 | 1,916,444 |
6.5.0-1545 | 2,009,289 |
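
For scale, the two figures in the table work out to a drop of roughly 4.6%. The arithmetic is sketched below; the function and constant names are illustrative only and do not come from any Couchbase tooling.

```cpp
// Relative change between two throughput samples, in percent.
// A negative result means throughput went down.
constexpr double percentChange(double before, double after) {
    return (after - before) / before * 100.0;
}

// Figures copied from the table above (ops/sec).
constexpr double kBuild1545 = 2009289.0;
constexpr double kBuild1561 = 1916444.0;

// (1,916,444 - 2,009,289) / 2,009,289 is about -4.62%.
static_assert(percentChange(kBuild1545, kBuild1561) < -4.6 &&
                      percentChange(kBuild1545, kBuild1561) > -4.7,
              "drop between the two builds is roughly 4.6%");
```

The check is evaluated at compile time, so compiling the snippet is enough to confirm the figure.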
Issue Links
- relates to MB-32389 10% drop in KV throughput over time (Closed)
For Gerrit Dashboard: MB-32107
# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
103678,2 | MB-32107: Create toy builds before/after regression | master | manifest | MERGED | +2 | +1 |
103899,10 | MB-32107: Resolve performance regression | master | kv_engine | ABANDONED | 0 | +1 |
104193,10 | MB-32107: Reduce false sharing of cache lines in DcpProducer | master | kv_engine | MERGED | +2 | +1 |
104255,5 | MB-32107: Reduce cache line contention by moving DcpProducer acquisition | master | kv_engine | MERGED | +2 | +1 |
104256,5 | MB-32107: Remove recursive nextCheckpointItemTask call in ActiveStream | master | kv_engine | MERGED | +2 | +1 |
104814,2 | MB-32107: Add MH 2202 toy to identify possible regression | master | manifest | MERGED | +2 | +1 |
105162,2 | MB-32107: Update manifests for perf | master | manifest | MERGED | +2 | +1 |
Tested using build 2384. This cache line is contended: update_topkeys only loads from offset 0x30, while mcbp_add_header stores to offset 0x38.
-------------------------------------------------------------
1 1209 839 1863 0 0x7cec80
-------------------------------------------------------------
96.77% 97.74% 0.00% 0.00% 0x30 0x48fcbd 296 145 161 2625 40 [.] update_topkeys memcached unique_ptr.h:147 0{40 56.3% n/a} 1{40 43.7% n/a}
3.23% 2.26% 100.00% 0.00% 0x38 0x48e1c0 402 182 225 5269 40 [.] mcbp_add_header memcached atomic_base.h:514 0{40 63.8% 57.8%} 1{40 36.2% 42.2%}
The issue is in Bucket.
(gdb) pahole Bucket
/* 179352 */ struct Bucket {
/* 0 40 */ std::mutex mutex
/* 40 48 */ std::condition_variable cond
/* 88 4 */ unsigned int clients
/* 92 1 */ std::atomic<Bucket::State> state
/* 93 1 */ Bucket::Type type
/* 94 101 */ char [101] name
/* XXX 40 bit hole, try to pack */
/* 200 96 */ std::array<std::vector<engine_event_handler, std::allocator<engine_event_handler> >, 4> engine_event_handlers
/* 296 24 */ std::vector<thread_stats, std::allocator<thread_stats> > stats
/* 320 176504 */ Timings timings
/* 176824 672 */ TimingHistogram subjson_operation_times
/* 177496 8 */ std::unique_ptr<TopKeys, std::default_delete<TopKeys> > topkeys
/* 177504 1704 */ std::array<Couchbase::RelaxedAtomic<unsigned long>, 213> responseCounters
/* 179208 64 */ ClusterConfiguration clusterConfiguration
/* 179272 8 */ unsigned long max_document_size
/* 179280 56 */ std::unordered_set<cb::engine::Feature, std::hash<cb::engine::Feature>, std::equal_to<cb::engine::Feature>, std::allocator<cb::engine::Feature> > supportedFeatures
/* 179336 8 */ EngineIface * engine
/* 179344 8 */ DcpIface * bucketDcp
}
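
As a cross-check, the two offsets in the contended cache line (0x30 and 0x38) can be matched against the topkeys and responseCounters offsets in this dump. The sketch below assumes 64-byte cache lines and that the profiled build has exactly this layout; the helper names are hypothetical.

```cpp
#include <cstdint>

// Assumption: 64-byte cache lines, as on typical x86-64 parts.
constexpr std::uint64_t kCacheLine = 64;

// Member offsets within Bucket, copied from the pahole dump above.
constexpr std::uint64_t kTopkeysOffset = 177496;
constexpr std::uint64_t kResponseCountersOffset = 177504;

// Offsets within the contended cache line, copied from the contention report.
constexpr std::uint64_t kLoadOffsetInLine = 0x30;  // update_topkeys
constexpr std::uint64_t kStoreOffsetInLine = 0x38; // mcbp_add_header

// Where the Bucket object would have to start inside a cache line for
// topkeys to land on the load offset seen in the report.
constexpr std::uint64_t impliedBucketStart() {
    return (kLoadOffsetInLine + kCacheLine - kTopkeysOffset % kCacheLine) %
           kCacheLine;
}

// Position of a member inside its cache line, given that inferred start.
constexpr std::uint64_t offsetInLine(std::uint64_t memberOffset) {
    return (impliedBucketStart() + memberOffset) % kCacheLine;
}

// With topkeys pinned to 0x30, responseCounters[0] falls on 0x38, which is
// exactly where the report sees the stores from mcbp_add_header.
static_assert(offsetInLine(kResponseCountersOffset) == kStoreOffsetInLine,
              "responseCounters[0] lands on the store offset");
```

Under these assumptions the Bucket object starts 24 bytes into a cache line, and the two hot accesses are the adjacent members topkeys and responseCounters[0].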
Not printing cache lines because timings is ludicrously big, but the topkeys unique_ptr shares a cache line with the start of the responseCounters array. responseCounters is full of atomics that track the number of responses of each type; the first (and almost certainly the hottest) is SUCCESS. That counter shares a line with topkeys, and as topkeys is only read in the sample while the counter is written, this is certainly false sharing. I will move topkeys to just after name, where it will share a cache line with the end of name and the start of engine_event_handlers instead:
(gdb) pahole Bucket
/* 179352 */ struct Bucket {
/* 0 40 */ std::mutex mutex
/* 40 48 */ std::condition_variable cond
/* 88 4 */ unsigned int clients
/* 92 1 */ std::atomic<Bucket::State> state
/* 93 1 */ Bucket::Type type
/* 94 101 */ char [101] name
/* XXX 40 bit hole, try to pack */
/* 200 8 */ std::unique_ptr<TopKeys, std::default_delete<TopKeys> > topkeys
/* 208 96 */ std::array<std::vector<engine_event_handler, std::allocator<engine_event_handler> >, 4> engine_event_handlers
/* 304 24 */ std::vector<thread_stats, std::allocator<thread_stats> > stats
/* 328 176504 */ Timings timings
/* 176832 672 */ TimingHistogram subjson_operation_times
/* 177504 1704 */ std::array<Couchbase::RelaxedAtomic<unsigned long>, 213> responseCounters
/* 179208 64 */ ClusterConfiguration clusterConfiguration
/* 179272 8 */ unsigned long max_document_size
/* 179280 56 */ std::unordered_set<cb::engine::Feature, std::hash<cb::engine::Feature>, std::equal_to<cb::engine::Feature>, std::allocator<cb::engine::Feature> > supportedFeatures
/* 179336 8 */ EngineIface * engine
/* 179344 8 */ DcpIface * bucketDcp
}
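
The reordering above can be sketched with simplified stand-in types. This is not the real Bucket definition: BucketBefore, BucketAfter and coldData are hypothetical, a raw pointer replaces the unique_ptr so that offsetof stays well defined, and 64-byte cache lines are assumed.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, as on typical x86-64 parts.
constexpr std::size_t kCacheLine = 64;

// Cache line index of a member at `offset` inside a line-aligned object.
constexpr std::size_t lineOf(std::size_t offset) {
    return offset / kCacheLine;
}

struct TopKeys; // opaque stand-in; only a pointer to it is stored

using Counters = std::array<std::atomic<std::uint64_t>, 213>;

// Before: the read-mostly pointer sits directly in front of the counters.
struct alignas(kCacheLine) BucketBefore {
    char name[101];
    char coldData[256];        // hypothetical stand-in for timings etc.
    TopKeys* topkeys;          // read on the hot path
    Counters responseCounters; // written on the hot path
};

// After: the pointer moves up next to name, away from the counters.
struct alignas(kCacheLine) BucketAfter {
    char name[101];
    TopKeys* topkeys;
    char coldData[256];
    Counters responseCounters;
};

static_assert(lineOf(offsetof(BucketBefore, topkeys)) ==
                      lineOf(offsetof(BucketBefore, responseCounters)),
              "before: topkeys shares a line with responseCounters[0]");
static_assert(lineOf(offsetof(BucketAfter, topkeys)) !=
                      lineOf(offsetof(BucketAfter, responseCounters)),
              "after: topkeys no longer shares a line with responseCounters[0]");
```

A static_assert of this kind could also serve as a guard against a later member reshuffle putting the two fields back on the same line, although it only holds when the object itself is cache-line aligned.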