  1. Couchbase Server
  2. MB-6119

"item_alloc_sizes" not correct when using append

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.8.1
    • Fix Version/s: bug-backlog
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels: None
    • Triage: Untriaged

      Description

      I set up a test that repeatedly appends to an item until it reaches 20 MB, yet the timings output for item_alloc_sizes only showed values in the 32-64k range. I presume this is because the stat only tracks the size of the incoming value, which doesn't work for append/prepend/incr/decr.

      Could this be extended to record the actual resulting size of the value for those specific operations, or would that be too invasive?
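
The mismatch described above can be reproduced with a small simulation. This is only a sketch, not ep-engine code: the power-of-two bucketing, the 40 KB chunk size, and the names `bucket` and `simulate_appends` are assumptions made for illustration. It feeds one histogram with the incoming chunk size (what the report presumes item_alloc_sizes does) and another with the size of the item after each append.

```python
from collections import Counter

KB = 1024
MB = 1024 * KB


def bucket(size):
    """Return the half-open power-of-two bucket (low, high) containing size."""
    low = 1
    while low * 2 <= size:
        low *= 2
    return (low, low * 2)


def simulate_appends(chunk_size, target_size):
    """Append fixed-size chunks to a single item until it reaches target_size.

    Returns two histograms keyed by bucket:
    - incoming: fed with the size of each appended chunk
    - resulting: fed with the size of the whole item after each append
    """
    incoming = Counter()
    resulting = Counter()
    item_size = 0
    while item_size < target_size:
        item_size += chunk_size
        incoming[bucket(chunk_size)] += 1
        resulting[bucket(item_size)] += 1
    return incoming, resulting


incoming, resulting = simulate_appends(chunk_size=40 * KB, target_size=20 * MB)
print(sorted(incoming))   # a single bucket: 32 KB to 64 KB
print(max(resulting))     # the largest bucket seen: 16 MB to 32 MB
```

With these assumed numbers, the incoming-size histogram never leaves the 32-64k bucket even though the item ends up at 20 MB, which matches the behaviour in the report.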


        Activity

        mikew Mike Wiederhold added a comment -

        So if you want the stat, there is one big issue. Say you do a set and then an update. To keep the stat correct, we would want to remove the size of the original item from the histogram and add the new size. This isn't really a big deal right now, but in the near future we plan on evicting all of the metadata from memory. If we do, we won't keep any information about the value or its size in memory for an evicted key, so an update to an evicted key would require a background fetch just to keep this histogram correct. For that reason I don't think it makes sense for us to have such a histogram. Let me know what your thoughts are here.
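
To make the cost in the comment above concrete, here is a minimal sketch of an "exact" histogram that removes the old size and adds the new one on every update. Everything in it is hypothetical: the class `ExactSizeHistogram`, the `disk` and `resident` dictionaries standing in for on-disk data and in-memory metadata, and the use of raw sizes instead of buckets. It is not how ep-engine stores items; it only counts how often the old size would have to come from disk.

```python
from collections import Counter


class ExactSizeHistogram:
    """Tracks the current size of every live item (illustrative only)."""

    def __init__(self, disk):
        self.disk = disk          # key -> size; stands in for on-disk data
        self.resident = {}        # key -> size for keys with metadata in RAM
        self.counts = Counter()   # size -> number of live items of that size
        self.bg_fetches = 0       # times the old size had to come from disk

    def _old_size(self, key):
        if key in self.resident:
            return self.resident[key]
        if key in self.disk:
            # Metadata was evicted, so the old size is only known on disk.
            self.bg_fetches += 1
            return self.disk[key]
        return None  # brand-new key

    def set(self, key, new_size):
        old = self._old_size(key)
        if old is not None:
            # Remove the previous size so the histogram stays exact.
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]
        self.counts[new_size] += 1
        self.resident[key] = new_size
        self.disk[key] = new_size

    def evict(self, key):
        """Drop all in-memory metadata for key (the planned full eviction)."""
        self.resident.pop(key, None)


h = ExactSizeHistogram(disk={})
h.set("a", 100)   # new key, nothing to remove
h.set("a", 300)   # resident, old size known in memory
h.evict("a")
h.set("a", 500)   # evicted, old size needs a background fetch
print(dict(h.counts), h.bg_fetches)
```

In this sketch a plain set on an evicted key, which would otherwise need no read at all, triggers a fetch purely for the sake of the stat.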

        perry Perry Krug added a comment -

        You make a very good point, Mike, and I had thought about that. However, for certain updates we're going to have to do a background fetch anyway, right? Any append/prepend/incr/decr would definitely need a background fetch (as it does today), and unless we keep the CAS ID resident, CAS would too. I'm not sure what the plans are for keeping any information about the keys resident, but if none is kept, then add and replace would also require finding the key on disk before completing the operation. I think that just about covers all of our operations. (As an aside, I'm curious how that will impact our expiration capabilities.)

        Only set() would never require a background fetch, and there you have the size of the new item within the operation itself.

        So then the question of removing an old size comes up. Given that we don't do it today with item_alloc_sizes, I'd say it's okay to just let that histogram grow and grow. It can be reset now, and that would suffice in the future IMO.

        Two other thoughts I had:
        - We could keep separate histograms for incoming versus outgoing items. The outgoing histogram would only need updating when we send data, and it would give another approximation of how large the items in the bucket are. The biggest downside is write-heavy workloads without corresponding reads. I could say that's sufficient... but you know the first write-heavy customer whose item sizes I'd like to see will cause a new bug to be opened.
        - What about taking a totally different approach and adding something to our item pager/dispatcher to maintain a much more accurate histogram of item sizes? Options include only dealing with items resident in RAM, or making it part of the ejection process to capture items as they leave, etc. Perhaps it could run periodically, so you never need to remove sizes from the histogram and just overwrite it with what's there currently...

        Thoughts?
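
The periodic, overwrite-only idea from the second bullet could look roughly like the sketch below. It assumes a background task that can walk the sizes of items currently resident in RAM; the names `PeriodicSizeStats`, `snapshot_histogram`, and `bucket_low`, and the power-of-two bucketing, are invented for illustration and do not correspond to the real item pager or dispatcher.

```python
from collections import Counter


def bucket_low(size):
    """Return the lower bound of the power-of-two bucket containing size."""
    low = 1
    while low * 2 <= size:
        low *= 2
    return low


def snapshot_histogram(resident_sizes):
    """Build a fresh histogram from the sizes of items currently in RAM."""
    hist = Counter()
    for size in resident_sizes.values():
        hist[bucket_low(size)] += 1
    return hist


class PeriodicSizeStats:
    """Holds the most recent snapshot; each run replaces the previous one."""

    def __init__(self):
        self.histogram = Counter()

    def run_once(self, resident_sizes):
        # Overwrite rather than adjust: no old size ever has to be removed,
        # so no background fetch is needed to keep the stat consistent.
        self.histogram = snapshot_histogram(resident_sizes)


items = {"a": 100, "b": 3000}      # key -> current size in bytes
stats = PeriodicSizeStats()
stats.run_once(items)
first = dict(stats.histogram)

items["a"] = 70000                 # "a" grew through appends between runs
stats.run_once(items)
second = dict(stats.histogram)
print(first, second)
```

The trade-off in this sketch is staleness between runs and, if only resident items are scanned, blindness to anything already ejected.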

        maria Maria McDuff (Inactive) added a comment -

        Mike/Perry,

        Is this a must-fix/must-have for the next release?
        Please update this bug.

        perry Perry Krug added a comment -

        This can be deferred.

        maria Maria McDuff (Inactive) added a comment -

        Post 3.0


          People

          • Assignee: Unassigned
          • Reporter: Perry Krug
          • Votes: 0
          • Watchers: 3
