Details
- Improvement
- Resolution: Unresolved
- Major
- Cheshire-Cat
- None
- 1
Description
As highlighted by MB-44452, deleting a Bucket is currently an O(N) operation, where N is the number of in-memory items in the Bucket; the cost becomes noticeable for any non-trivial N.
This is because Bucket deletion requires deleting all VBucket objects, which requires deleting all HashTable objects, which requires individually deleting every StoredValue in the HashTable.
However, in the general case of deleting an entire bucket, we can in theory optimise this. We use a single jemalloc arena for each Bucket, and once the Bucket is deleted the arena should be empty. We should be able to skip all the costly deallocation of individual allocations and simply delete the arena. We currently re-use arenas and never delete them, but it is possible in principle.
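The idea of dropping the arena instead of freeing each allocation can be illustrated with a minimal sketch. It uses the standard library's `std::pmr::monotonic_buffer_resource` purely as a stand-in for a per-Bucket jemalloc arena; `Item`, `CountingResource` and `dropArenaOf` are hypothetical names for illustration and not part of kv_engine. The sketch only counts how many times memory is handed back to the upstream allocator when the arena is dropped as a whole.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>

// Hypothetical stand-in for StoredValue. It is trivially destructible, so
// skipping its destructor is safe in this sketch.
struct Item {
    std::size_t key;
    std::size_t value;
};

// Upstream resource that counts how often memory is returned to it.
class CountingResource : public std::pmr::memory_resource {
public:
    std::size_t deallocations = 0;

private:
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        ++deallocations;
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(
            const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
};

// Allocates n items from one arena, then drops the arena in one go.
// Returns the number of upstream deallocations caused by the teardown.
std::size_t dropArenaOf(std::size_t n) {
    CountingResource upstream;
    {
        std::pmr::monotonic_buffer_resource arena(&upstream);
        for (std::size_t i = 0; i < n; ++i) {
            void* mem = arena.allocate(sizeof(Item), alignof(Item));
            new (mem) Item{i, i * 2};
        }
        // No per-item deallocation here: when the arena goes out of scope
        // it releases its chunks back to the upstream resource.
    }
    return upstream.deallocations;
}
```

With this stand-in, the number of deallocations depends on the number of chunks the arena grew to rather than on the number of items. Whether a real jemalloc arena can be torn down as cheaply in kv_engine is exactly what this issue would need to establish.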
This would also require careful resource management: we'd essentially have to skip all the normal destructors which own memory (the primary example being HashTable), but ensure any non-trivial resources (blob shared pointers, mutexes, file descriptors, background Tasks, etc.) are still correctly cleaned up.
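One possible shape for that split, sketched under assumptions: memory-only objects live in the arena and are never destroyed individually, while anything owning a non-memory resource registers an explicit cleanup action that always runs. `FastTeardownBucket`, `allocateRaw`, `registerCleanup` and `runTeardownDemo` are hypothetical names, and `std::pmr::monotonic_buffer_resource` again stands in for the jemalloc arena.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory_resource>
#include <new>
#include <utility>
#include <vector>

// Hypothetical bucket that separates "plain memory" from "real resources".
class FastTeardownBucket {
public:
    // Memory-only objects: carved out of the arena, never freed one by one.
    void* allocateRaw(std::size_t bytes, std::size_t align) {
        return arena.allocate(bytes, align);
    }

    // Non-memory resources (file descriptors, tasks, ...) register an
    // action that must run even on the fast deletion path.
    void registerCleanup(std::function<void()> fn) {
        cleanups.push_back(std::move(fn));
    }

    ~FastTeardownBucket() {
        // Run the cleanups in reverse registration order. The arena member
        // is destroyed after this body, releasing all raw memory at once.
        for (auto it = cleanups.rbegin(); it != cleanups.rend(); ++it) {
            (*it)();
        }
    }

private:
    std::pmr::monotonic_buffer_resource arena;
    std::vector<std::function<void()>> cleanups;
};

// Builds a bucket with many raw allocations and two registered resources,
// destroys it, and returns how many cleanup actions ran.
int runTeardownDemo() {
    int released = 0;
    {
        FastTeardownBucket bucket;
        for (long i = 0; i < 1000; ++i) {
            new (bucket.allocateRaw(sizeof(long), alignof(long))) long(i);
        }
        bucket.registerCleanup([&released] { ++released; }); // e.g. close a file
        bucket.registerCleanup([&released] { ++released; }); // e.g. cancel a task
    }
    return released;
}
```

The sketch only works because the arena-allocated objects are trivially destructible. In the real code, the hard part would be auditing which types reachable from a Bucket hold something other than arena memory.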
Some other possible improvements could involve introducing a fast-path for bucket deletion, where we skip updating certain counters.
- When the HashTable destructor runs, we iterate all StoredValues and update collection metrics. In the case of a full bucket delete, we might be able to skip this.
- We might be able to multi-thread the StoredValue iteration and destruction.
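The first of these options, a fast path that skips counter updates, could look roughly like the following sketch. `MiniHashTable`, `CollectionStats` and `TeardownMode` are hypothetical and much simpler than the real HashTable; the point is only that a teardown mode can gate the per-item stats work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-collection counters.
struct CollectionStats {
    std::size_t itemCount = 0;
    std::size_t memUsed = 0;
};

enum class TeardownMode { Normal, BucketDelete };

// Simplified hash table: stores only the size of each value.
class MiniHashTable {
public:
    explicit MiniHashTable(CollectionStats& s) : stats(s) {}

    void insert(std::size_t bytes) {
        sizes.push_back(bytes);
        ++stats.itemCount;
        stats.memUsed += bytes;
    }

    // Clears the table and returns the number of per-item stat updates made.
    std::size_t clear(TeardownMode mode) {
        std::size_t updates = 0;
        if (mode == TeardownMode::Normal) {
            // Normal path: keep the counters accurate item by item.
            for (std::size_t bytes : sizes) {
                --stats.itemCount;
                stats.memUsed -= bytes;
                ++updates;
            }
        }
        // Fast path: the counters are about to be discarded with the
        // bucket, so they are deliberately left stale.
        sizes.clear();
        return updates;
    }

private:
    CollectionStats& stats;
    std::vector<std::size_t> sizes;
};

// Fills a table with n items and reports the stat updates done by clear().
std::size_t statUpdatesDuringClear(TeardownMode mode, std::size_t n) {
    CollectionStats stats;
    MiniHashTable ht(stats);
    for (std::size_t i = 0; i < n; ++i) {
        ht.insert(64);
    }
    return ht.clear(mode);
}
```

In this sketch the fast path still walks nothing at all; in the real HashTable the StoredValues would still need to be freed unless this is combined with the arena approach.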
Issue Links
- relates to MB-44452 [couchstore]: Graceful Failover -> Full Recovery -> Rebalance failed due to buckets_shutdown_wait_failed (Closed)