You make a very good point, Mike, and I had thought about that. However, for certain updates we're going to have to do a background fetch anyway, right? Any append/prepend/incr/decr would definitely need a background fetch (as it does today), and unless we keep the CAS ID resident, CAS would too. I'm not sure what the plans are for keeping any information about the keys resident, but if there is none, then add and replace would also require finding the key on disk before completing the operation. I think that just about covers all of our operations (as an aside, I'm curious how that will impact our expiration capabilities).
Only set() would never require a background fetch, and in that case the size of the new item is already available within the operation itself.
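To spell out the cases above, here is a rough sketch in Python. The function and flag names are made up for illustration and are not engine code; the two flags stand in for the open questions about whether the CAS ID or any key information stays resident.

```python
# Hypothetical sketch: operation names mirror the commands discussed above.
# The residency flags are assumptions, not real engine settings.
ALWAYS_FETCH = {"append", "prepend", "incr", "decr"}  # need the old value
NEEDS_CAS = {"cas"}                                    # need the old CAS ID
NEEDS_EXISTENCE = {"add", "replace"}                   # need to know if the key exists


def needs_bg_fetch(op, cas_resident=False, key_resident=False):
    """Return True if a non-resident item must be read from disk first."""
    if op == "set":
        return False  # the new size arrives with the operation itself
    if op in ALWAYS_FETCH:
        return True
    if op in NEEDS_CAS:
        return not cas_resident
    if op in NEEDS_EXISTENCE:
        return not key_resident
    raise ValueError("unknown operation: %s" % op)
```

With both flags left at False, every operation except set() reports that it needs a fetch, which matches the list above.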
So then the question of removing an old size comes up. Given that we don't do it today with item_alloc_sizes, I'd say it's okay to just let that histogram keep growing. It can be reset today, and that would suffice in the future IMO.
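For illustration, an add-only histogram with a manual reset could look roughly like this. It is a sketch only: the class name and the power-of-two bucket bounds are my assumptions, not how item_alloc_sizes is actually implemented.

```python
import bisect


class SizeHistogram:
    """Add-only histogram of item sizes; old sizes are never removed."""

    def __init__(self, max_exp=20):
        # Assumed bucket upper bounds: 1 B, 2 B, 4 B, ... up to 2**max_exp.
        self.bounds = [2 ** e for e in range(max_exp + 1)]
        # One extra bucket at the end catches anything above the last bound.
        self.counts = [0] * (len(self.bounds) + 1)

    def add(self, size):
        # Index of the first bound >= size, or the overflow bucket.
        self.counts[bisect.bisect_left(self.bounds, size)] += 1

    def reset(self):
        self.counts = [0] * len(self.counts)

    def total(self):
        return sum(self.counts)
```

Since nothing is ever subtracted, an overwritten item is counted once per write, so the result describes the sizes written since the last reset rather than what is currently stored.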
Two other thoughts I had:
-We could keep separate histograms for incoming versus outgoing items. Since we would only update the outgoing one when we send data, it would give another approximation of how large the items in the bucket are. The biggest downside is write-heavy workloads without corresponding reads. I could say that's sufficient here... but you know the first write-heavy customer whose item sizes I'd like to see will cause a new bug to be opened.
-What about taking a totally different approach and adding something to our item pager/dispatcher to maintain a much more accurate histogram of item sizes? A few ways to do this would be to deal only with items resident in RAM, or to make it part of the ejection process so it captures items as they leave, etc. Perhaps it could be something that runs periodically, so you never need to remove sizes from the histogram, you just overwrite it with what's there currently...
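A minimal sketch of that last idea, the periodic overwrite, in Python. Everything here is hypothetical: `resident_sizes` stands in for whatever the pager/dispatcher would walk, and the power-of-two bucketing is just an assumption to keep the example short.

```python
from collections import Counter


def bucket_for(size):
    """Round a size up to the next power of two to pick its bucket."""
    return 1 if size <= 1 else 1 << (size - 1).bit_length()


class SnapshotHistogram:
    """Histogram rebuilt from scratch on each pass instead of updated in place."""

    def __init__(self):
        self.counts = Counter()

    def rebuild(self, resident_sizes):
        # Build a fresh histogram from the items seen on this pass...
        fresh = Counter(bucket_for(s) for s in resident_sizes)
        # ...then overwrite the old one, so nothing ever has to be subtracted.
        self.counts = fresh
```

The trade-off is that the numbers are only as fresh as the last pass, and with a RAM-only walk they say nothing about items that have already been ejected.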