Details
- Bug
- Resolution: Fixed
- Major
- 5.5.0
- Untriaged
- Centos 64-bit
- Yes
Description
Several tests indicate problems with the initial data load. Clients keep receiving ep_tmp_oom_errors even in the absence of large persistence and replication queues. It looks like kv-engine cannot evict items promptly: the ep_num_value_ejects counter literally freezes for several minutes.
Let's use the following test case as an example:
- 2 nodes
- 1 bucket (full ejection)
- 100M items
I stopped the clients after the first TMP OOM error at 15:06:50 and left the system running.
I can see that one of the three non-IO threads is constantly busy (100% CPU) and the ep_num_eject_failures counter keeps increasing. Once in a while items get ejected.
Logs, perf profile for non-IO thread, and some graphs from mortimer are attached.
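The "frozen counter" symptom described above can be checked mechanically once a stat such as ep_num_value_ejects has been sampled over time. The following is only a minimal sketch: the function name `find_stalls`, the `(timestamp, value)` sample format, and the numbers in the usage example are hypothetical and are not taken from the attached logs.

```python
from typing import List, Tuple


def find_stalls(
    samples: List[Tuple[float, int]], min_duration: float
) -> List[Tuple[float, float]]:
    """Return (start, end) intervals during which a sampled counter
    stayed at the same value for at least `min_duration` seconds.

    `samples` is a time-ordered list of (timestamp_seconds, counter_value)
    pairs. How the samples are collected is outside this sketch.
    """
    stalls: List[Tuple[float, float]] = []
    if not samples:
        return stalls

    # Start of the current run of identical values, and that value.
    start_ts, last_value = samples[0]
    prev_ts = start_ts

    for ts, value in samples[1:]:
        if value != last_value:
            # The run ended at the previous sample; record it if long enough.
            if prev_ts - start_ts >= min_duration:
                stalls.append((start_ts, prev_ts))
            start_ts, last_value = ts, value
        prev_ts = ts

    # Close the final run, which may still be ongoing at the last sample.
    if prev_ts - start_ts >= min_duration:
        stalls.append((start_ts, prev_ts))
    return stalls


# Hypothetical samples taken once a minute: the counter is flat for
# three minutes and then starts moving again.
example = [(0, 10), (60, 10), (120, 10), (180, 10), (240, 15), (300, 20)]
stalled = find_stalls(example, min_duration=120)
```

Running the same check on ep_num_eject_failures would be expected to show the opposite pattern (a steadily increasing counter with no stalls) if the behaviour matches the description.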
Issue Links
- relates to: MB-22010 Enhance fidelity of ep-engine's LRU (Closed)
For Gerrit Dashboard: MB-28047

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
89662,7 | MB-28047: Change the update interval to be percent of items in HT | master | kv_engine | Status: MERGED | +2 | +1 |
89672,10 | MB-28047: Update the memory recover target after visiting each vbucket | master | kv_engine | Status: MERGED | +2 | +1 |
89736,6 | MB-28047: Remove unrequired checkspoints immediately after eviction | master | kv_engine | Status: MERGED | +2 | +1 |
89742,5 | MB-22010: Change the default eviction policy to statistical_counter | master | kv_engine | Status: MERGED | +2 | +1 |
89744,1 | MB-28047: Extra Debug | master | kv_engine | Status: ABANDONED | 0 | +1 |
89803,6 | MB-28047: Bias the eviction histogram for items that cannot be evicted | master | kv_engine | Status: MERGED | +2 | +1 |
89805,6 | MB-28047: Use correct types for HdrHistogram_c functions | master | kv_engine | Status: MERGED | +2 | +1 |