Couchbase Server: MB-7504

Eviction, Ejection, working set management

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: None
    • Component/s: None
    • Security Level: Public
    • Labels:
      None

      Description

      Need more content here:

      http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-introduction-architecture-eviction.html

      -Describe new logic for eviction as of 2.0
      -Tie into Access Logs topic
      -Any command line which impacts this behavior

      Input from Liang:

      -ejection vs eviction: eviction is a concept from memcached; it means the entire item (key, metadata and value) is removed from RAM. Ejection is what we do now for 1.8 and 2.0: the value is removed, but the key and metadata stay in memory
      -warmup process - two iterations occur: first keys and metadata are loaded, then values for the keys in the access log. See info added on disk warmup on access scanner.
      -expiry pager: removes expired items from memory. A disk cleanup process
      -item pager: ejects items if the high water mark is reached; ejects those that are not dirty and do not still need to be replicated to another node in the cluster. High water mark is configurable

      -item pager: looks at NRU first; if the high water mark is still breached, ejects at random, using the active vs replica percentage. Also configurable via cbepctl
      -goal: improve memory efficiency, maintain balance of active/replica data.

      defaults: ejection % is 40% for active, with a 50% cap. One parameter adjusts two items: active, and replica as the remainder

      goal: reach the low water mark, but if under HWM, the timer pager stops; if not, it repeats until under.
      -NRU - replicated. If there is a mutation on an item, it is set to true. This true value is replicated to the other node in the cluster. 1 bit.
      -access scanner - adds all NRU-true items to the access log, then sets all items to false after the pass, thereby influencing the item pager
      -(percentage memory to be free) = (total memory - lwm)/ total memory

      -config hwm at each node? Must be reset after node failure. Configs are in memory on the node.
      -high active ratio reduces cache misses,
      -too low replica % - if a replica node is promoted, it will experience cache misses - post rebalance
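To make the ejection vs eviction distinction and the free-memory formula in the notes above concrete, here is a minimal Python sketch. It is not ep-engine code: the `Item` class, its field names, and the helper functions are hypothetical and exist only to illustrate the two concepts.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Item:
    key: str
    metadata: dict          # illustrative only, e.g. expiry or flags
    value: Optional[bytes]  # None means the value is not resident in RAM


def eject(cache: Dict[str, Item], key: str) -> None:
    """Ejection: drop only the value; key and metadata stay in RAM."""
    cache[key].value = None


def evict(cache: Dict[str, Item], key: str) -> None:
    """Eviction (memcached concept): remove key, metadata and value from RAM."""
    del cache[key]


def fraction_to_free(total_memory: int, low_water_mark: int) -> float:
    """(total memory - lwm) / total memory, as written in the notes."""
    return (total_memory - low_water_mark) / total_memory


cache = {
    "a": Item("a", {"expiry": 0}, b"hello"),
    "b": Item("b", {"expiry": 0}, b"world"),
}
eject(cache, "a")  # "a" is still known to the cache, but its value is gone
evict(cache, "b")  # "b" is gone entirely
```

After `eject`, a lookup for `"a"` can still be answered from metadata (including a fast "miss" for keys that do not exist at all), while the value itself would have to come from disk.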

      Input from Jin on Configurations:

      -Hi, you can change the default water marks by "flush_param [mem_high_wat/mem_low_wat] value" via cbepctl.

      You can change the default percentage of items that the item pager ejects from active vbuckets by
      "flush_param pager_active_vb_pcnt value" where value is between 0 and 50%.

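As a rough illustration of the settings Jin describes, the Python sketch below builds a cbepctl command line for the three parameters named above and checks the 0 to 50 range for `pager_active_vb_pcnt`. The ticket only gives the `flush_param NAME value` part; the `host:port` argument, the `-b` bucket option, the `set` keyword, and the example host and bucket names are assumptions here and should be checked against the cbepctl reference for your release. The helper name `flush_param_command` is hypothetical.

```python
from typing import List

# Parameters named in this ticket; anything else is rejected by this sketch.
ALLOWED = {"mem_high_wat", "mem_low_wat", "pager_active_vb_pcnt"}


def flush_param_command(host: str, bucket: str, name: str, value: int) -> List[str]:
    """Build an argv list for a cbepctl flush_param change (assumed syntax)."""
    if name not in ALLOWED:
        raise ValueError(f"unsupported flush_param in this sketch: {name}")
    if name == "pager_active_vb_pcnt" and not 0 <= value <= 50:
        raise ValueError("pager_active_vb_pcnt must be between 0 and 50")
    # Always include a bucket name, as requested in MB-7549.
    return ["cbepctl", host, "-b", bucket, "set", "flush_param", name, str(value)]


cmd = flush_param_command("localhost:11210", "default", "pager_active_vb_pcnt", 40)
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run` without shell quoting concerns; the sketch deliberately stops short of running anything.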

        Activity

        kzeller kzeller added a comment -

        You can add them here. Do please include the section # and paragraph #.

        perry Perry Krug added a comment -

        Eviction/Ejection/Working set:
        Page 9: I would suggest that "Data in RAM" should be changed to "Built-in Caching layer"…the description below really doesn't describe anything about what we do with Data in RAM
        Page 9: First two paragraphs are redundant
        Page 9: First sentence of 2nd paragraph under "Data in RAM" is not a complete sentence
        Page 9: We cannot recommend the "maximum amount of RAM" to be allocated…need to revise
        Page 9: Last sentence under "Data in RAM" - "kept" should be "keep"
        Page 9: Eviction/Ejection/Working set: The first two paragraphs are fairly redundant, though not completely
        Page 9: The 2nd to last paragraph is very wrong, implies that occasionally we "evict" data…that never happens and is not part of the software. LRU is for eviction in memcached buckets, NRU is for ejection in Couchbase buckets
        Page 76: First sentence is confusing and possibly incorrect. This is not "all" the data available for read- and write- access. Firstly, there is data not in RAM that is available to be read, and secondly, a "write" can occur to data that doesn't even exist…so how can it be in RAM?
        Page 76: 2nd paragraph: The metadata and keys are also kept in RAM for many other reasons. Very fast "miss" access is extremely important to some customers, especially for add() operations. And our expiration process uses the metadata in RAM to quickly scan for items that are already expired.
        Page 76: 3rd paragraph. I don't think this description is correct. Once the RAM hits the low watermark, some replica data is immediately ejected as it is written to disk (confirm with Chiyoung). Once the RAM hits the high water mark, a percentage of active and replica data is ejected until the RAM usage hits the low water mark (not just gets below the high). The watermarks are NOT expressed as percentages, but rather absolute byte values. Just because we can set them as percentages does not mean they are stored or represented that way. Why do we call out only the "low water mark" as viewable as a server statistic?
        Page 76: You say that it ejects randomly…and then say that it is not completely random.
        Page 76: Note that the percentages are defaults and can be changed…link to where they can be changed?
        Page 76: The description of who sets the NRU bit to true/false is backwards
        Page 76: Seems to be missing a lot of useful descriptive information from: http://hub.internal.couchbase.com/confluence/display/cbeng/EP-Engine+Item+Pager+2.0
        Page 76: Links to "Handling Server Warmup" are duplicated
        Page 76: Why is "Handling Server Warmup" related to change ejection settings?
        Page 77: Why repeat the first few sentences under "Changing thresholds"
        Page 77: As per bug MB-7549, please make sure that cbepctl examples include a bucket name
        Page 76/77: We need more description about the effect of changing the high and low watermarks. Especially guidance on how far apart to set them since this can be a major problem if users are tweaking them. We should really note that we DO NOT recommend changing these values from the default unless required. Give a customer a gun, and they WILL shoot themselves in the foot with it…the least we can do is warn
        Page 77: Under 5.3.2…the "true versus false" description of NRU is incorrect.
        Page 77: We do not have active "nodes" nor replica "nodes"…this implies an incorrect connotation (the whole two paragraphs under 5.3.2 should be reviewed for this)
        Page 77: Typo: "However, if have the server…" should be "…if you have the server…"
        Page 77: The warning at the end of 5.3.2 is EXCELLENT. We should strive to have a note/warning like this on every single setting that can be changed. Our goal is to make Couchbase Server perfectly functional out of the box with no changes…anything that can be changed will have consequences and we should call those out.
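The watermark behaviour described in the Page 76 comment above (nothing happens until usage reaches the high water mark, then ejection continues until usage is down to the low water mark) can be sketched as follows. This is an illustrative Python model under that description only, not the ep-engine item pager; `run_item_pager` and its arguments are hypothetical, and all quantities are absolute byte counts, as the comment insists.

```python
from typing import List


def run_item_pager(mem_used: int, high_wat: int, low_wat: int,
                   ejectable_sizes: List[int]) -> int:
    """Return memory in use after one pager run.

    The pager is idle while usage is below the high water mark. Once
    triggered, it ejects values until usage is at or below the low water
    mark, or until nothing ejectable is left.
    """
    if mem_used < high_wat:
        return mem_used  # not triggered
    for size in ejectable_sizes:
        if mem_used <= low_wat:
            break  # target reached: stop at the LOW mark, not just below the high one
        mem_used -= size
    return mem_used


# Usage: watermarks at 75 and 60 bytes, five ejectable values of 10 bytes each.
after_idle = run_item_pager(70, 75, 60, [10] * 5)
after_run = run_item_pager(80, 75, 60, [10] * 5)
```

The gap between the two marks is what keeps the pager from being triggered again immediately after a run, which is one reason setting them too close together is risky.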

        Warmup feedback
        Page 72: Why do we refer back to the section we're already in (3rd paragraph)?
        Page 73: What does an ENGINE_TMPFAIL (0x0d) mean to a client? To an application?
        Page 73: Under 5.1.1, these examples for cbstats are incorrect
        Page 74: The note should specify that it is per-bucket per-node

        kzeller kzeller added a comment -

        From Jin:

        Sorry for the delay, I reviewed it a while ago but forgot to send you an email. My apology.
        Again, overall it all looks great!

        Thanks much,
        Jin

        Page 9:
        1st and 2nd paragraphs seem to describe basically the exact same thing. I wonder if this was your intention.
        In Couchbase 2.0 there is no eviction at all. Ejection is the only internal mechanism to free RAM in 2.0 and beyond.

        Page 76:
        We may want to hide or tone down the detailed explanation of the item pager's ejection algorithm in the following paragraph, simply because (being politically incorrect) it is not such a good algorithm (too simple to describe as an NRU algorithm - no access pattern awareness, etc.)

        "Note that there are two processes which change the NRU for an item…."

        Page 77:
        I believe allowing a new low water mark that is higher than the high water mark is wrong. I just checked, and ep-engine currently doesn't prevent users from doing that; I will open a separate bug to address it. Anyhow, can you please change the example of setting mem_low_wat to "65" instead of "75"?
        In the first paragraph in 5.3.2 there is a typo "note recently used" which should be "not recently used".
        No need to mention whether the ejected items are on a particular server node; each node does exactly the same thing, ejecting nru=true items first and then randomly picked active and replica items on it.
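The check Jin says ep-engine does not yet enforce could look roughly like the sketch below. It is a hypothetical client-side guard in Python, not the engine's behaviour; `validate_watermarks` is an invented name, and the example values of 65 and 75 follow the example mentioned above, with 75 taken as the high water mark by assumption.

```python
def validate_watermarks(mem_low_wat: int, mem_high_wat: int) -> None:
    """Reject a low water mark that is higher than the high water mark."""
    if mem_low_wat > mem_high_wat:
        raise ValueError(
            f"mem_low_wat ({mem_low_wat}) must not exceed "
            f"mem_high_wat ({mem_high_wat})"
        )


validate_watermarks(65, 75)  # accepted: low mark is below the high mark

try:
    validate_watermarks(80, 75)
    rejected = False
except ValueError:
    rejected = True  # low mark above high mark is refused
```

Whether equal marks should also be refused is not covered in the comment, so the sketch leaves that case alone.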

        kzeller kzeller added a comment -

        Fixed and pushed:

        -remove repeats
        -fix HWM and LWM info
        -add warning about changing defaults
        -change to replica data and data written to node vs "replica node/active node"
        -add back eviction of data if only memcached buckets....
        -add named bucket to examples

        Note: Chiyoung confirms LWM HWM info/warning, notes we will have guidance for 2.0.1


          People

          • Assignee:
            kzeller kzeller
            Reporter:
            kzeller kzeller