Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-50941

Compaction reschedules but never runs (sleeping for ~0 seconds?)

    XMLWordPrintable

Details

    Description

      A reproduction attempt for MB-50747 ended with all collections dropped, but a huge item count.

      Analysis of the log files show that vbuckets initially run compaction and purge some collections (only those that have the drop flushed will purge). Compaction can be seen to reschedule but with no further run.

      For example from https://cb-engineering.s3.amazonaws.com/MB-50747_repro_on_toy/collectinfo-2022-02-11T131318-ns_1%40172.23.120.201.zip we can see vb:496 (active) follow this pattern.

      The first compaction logs and shows 30 collections are erased, with 6720 alive items discarded. Compaction logs a reschedule, but no further run occurs leaving many dropped items on disk.

      2022-02-11T04:07:07.980679-08:00 INFO (default) Compaction of vb:496, task:24606, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true created with delay:2147483647s (awaiting completion).
      2022-02-11T04:07:53.127468-08:00 INFO (default) MagmaKVStore::compactDBInternal: vb:496 purge_before_ts:1644322073 purge_before_seq:0 drop_deletes:false retain_erroneous_tombstones:true
      2022-02-11T04:07:55.557921-08:00 INFO (default) Compaction of vb:496 done (ok). purged tombstones:0, prepares:0, prepareBytes:0 collection_items_erased:alive:6720,deleted:1590, collections_erased:30, size/items/tombstones/purge_seqno pre{14270464, 22475, 0, 0}, post{10649600, 15755, 0, 0}
      2022-02-11T04:07:56.277122-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true created with delay:2147483647s (awaiting completion).
      2022-02-11T04:07:56.760657-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:07:59.656673-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:03.857805-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:04.876103-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:06.782053-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:11.445273-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:13.146327-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:14.860463-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:16.922381-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:19.159510-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:25.198287-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:27.950410-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:30.860334-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:08:33.909172-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      2022-02-11T04:21:57.743760-08:00 INFO (default) Compaction of vb:496, task:24826, purge_before_ts:0, purge_before_seq:0, drop_deletes:false, internal:true rescheduled with delay:2147483647s (awaiting completion).
      

      It initially ran and purged some data, but was then rescheduled and never ran again. mdocs.log shows that it's metadata/local state has plenty of dropped collections just ripe for pruning.

                  "Path": "/data/default/magma.0/kvstore-496",
                  "kvstore": {
                      "Totals": {
                          "numSSTables": 11,
                          "numStateFiles": 18,
                          "ssTableTotalSize": 11077761
                      },
                      "keyIndex": {
                          "numSSTables": 1,
                          "numStateFiles": 6,
                          "ssTableTotalSize": 565248
                      },
                      "localDocs": [
                          {
                              "Key": "_local/collections/dropped",
                              "Value": {
                                  "entries": [
                                      {
                                          "collectionId": 76,
                                          "endSeqno": 36244,
                                          "startSeqno": 69
                                      },
      ...
      

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              Balakumaran.Gopal Balakumaran Gopal
              jwalker Jim Walker
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty