Details
-
Bug
-
Resolution: Fixed
-
Critical
-
7.1.0
-
This is with a toy-build (simple) containing one
MB-50747fix to collection purging that doesn't look like it could cause this.
http://latestbuilds.service.couchbase.com/builds/latestbuilds/couchbase-server/toybuilds/14408/
-
Untriaged
-
-
1
-
Yes
-
KV 2022-Feb
Description
A reproduction attempt for MB-50747 ended with all collections dropped, but a huge item count.
Analysis of the log files show that vbuckets initially run compaction and purge some collections (only those that have the drop flushed will purge). Compaction can be seen to reschedule but with no further run.
For example from https://cb-engineering.s3.amazonaws.com/MB-50747_repro_on_toy/collectinfo-2022-02-11T131318-ns_1%40172.23.120.201.zip we can see vb:496 (active) follow this pattern.
The first compaction logs and shows 30 collections are erased, with 6720 alive items discarded. Compaction logs a reschedule, but no further run occurs leaving many dropped items on disk.
2022-02-11T04:07:07.980679-08:00 INFO (default) Compaction of vb:496, task:24606, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true created with delay:2147483647s (awaiting completion).
|
2022-02-11T04:07:53.127468-08:00 INFO (default) MagmaKVStore::compactDBInternal: vb:496 purge_before_ts:1644322073 purge_before_seq:0 drop_deletes:false retain_erroneous_tombstones:true
|
2022-02-11T04:07:55.557921-08:00 INFO (default) Compaction of vb:496 done (ok). purged tombstones:0, prepares:0, prepareBytes:0 collection_items_erased:alive:6720,deleted:1590, collections_erased:30, size/items/tombstones/purge_seqno pre{14270464, 22475, 0, 0}, post{10649600, 15755, 0, 0}
|
2022-02-11T04:07:56.277122-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true created with delay:2147483647s (awaiting completion).
|
2022-02-11T04:07:56.760657-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:07:59.656673-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:03.857805-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:04.876103-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:06.782053-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:11.445273-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:13.146327-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:14.860463-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:16.922381-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:19.159510-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:25.198287-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:27.950410-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:30.860334-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:08:33.909172-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
2022-02-11T04:21:57.743760-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
|
It initially ran and purged some data, but was then rescheduled and never ran again. mdocs.log shows that it's metadata/local state has plenty of dropped collections just ripe for pruning.
"Path": "/data/default/magma.0/kvstore-496",
|
"kvstore": {
|
"Totals": {
|
"numSSTables": 11,
|
"numStateFiles": 18,
|
"ssTableTotalSize": 11077761
|
},
|
"keyIndex": {
|
"numSSTables": 1,
|
"numStateFiles": 6,
|
"ssTableTotalSize": 565248
|
},
|
"localDocs": [
|
{
|
"Key": "_local/collections/dropped",
|
"Value": {
|
"entries": [
|
{
|
"collectionId": 76,
|
"endSeqno": 36244,
|
"startSeqno": 69
|
},
|
...
|
Attachments
Issue Links
- is duplicated by
-
MB-50956 Compaction started during rebalance got stuck
- Closed
-
MB-50957 Count mismatch between curr_items and collections' num_items
- Closed
-
MB-50963 KV durability tests stuck compacting bucket (7.1.0-2223 -> 2284)
- Closed
-
MB-51046 [system test] FTS: high fragmentation causing disk full in one of the kv node
- Closed
- relates to
-
MB-50988 Rescheduled Compaction tasks do not obey the concurrency limit
- Closed