Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-7803

[RN]Items need to be loaded from the access log incrementally

    Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0
    • Fix Version/s: techdebt-backlog
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels:
      None
    • Triage:
      Untriaged

      Description

      If the mutation log is large we currently load the entire thing into memory and this can cause us to go over the low watermark and not load any items into memory during warmup. The solution is to incrementally read from the mutation log and load keys until we hit the low watermark.

      No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

        Hide
        mikew Mike Wiederhold added a comment -

        It still looks like we are loading in the entire file.

        https://github.com/membase/ep-engine/blob/master/src/mutation_log.cc#L828

        Can you point to the code that fixes this in case I missed something?

        Show
        mikew Mike Wiederhold added a comment - It still looks like we are loading in the entire file. https://github.com/membase/ep-engine/blob/master/src/mutation_log.cc#L828 Can you point to the code that fixes this in case I missed something?
        Hide
        abhinav Abhinav Dangeti added a comment -

        What you're pointing to is the part where we read the keys from the access log, and add to a list.

        We check the memory usage as part of the population phase where we actually insert the read item into memory, which is in the warmup callbacks:
        https://github.com/couchbase/ep-engine/blob/master/src/warmup.cc#L64
        https://github.com/couchbase/ep-engine/blob/master/src/warmup.cc#L108

        Show
        abhinav Abhinav Dangeti added a comment - What you're pointing to is the part where we read the keys from the access log, and add to a list. We check the memory usage as part of the population phase where we actually insert the read item into memory, which is in the warmup callbacks: https://github.com/couchbase/ep-engine/blob/master/src/warmup.cc#L64 https://github.com/couchbase/ep-engine/blob/master/src/warmup.cc#L108
        Hide
        mikew Mike Wiederhold added a comment -

        If I remember correctly, the problem is that we are reading in the entire file before loading any keys. If the file is large we might not actually read enough data from the files.

        Show
        mikew Mike Wiederhold added a comment - If I remember correctly, the problem is that we are reading in the entire file before loading any keys. If the file is large we might not actually read enough data from the files.
        Hide
        abhinav Abhinav Dangeti added a comment -

        The way I see it, all keys need to be loaded into memory in case of VALUE_EVICTION anyway, so only for FULL_EVICTION we'll need to incrementally load items during the reading phase while loading from access log. Let me know if you disagree.

        Show
        abhinav Abhinav Dangeti added a comment - The way I see it, all keys need to be loaded into memory in case of VALUE_EVICTION anyway, so only for FULL_EVICTION we'll need to incrementally load items during the reading phase while loading from access log. Let me know if you disagree.
        Hide
        drigby Dave Rigby added a comment -

        I believe Mike's comments still stand as of current master - MutationLogHarvester::load() is responsible for loading the log from disk (called from Warmup::doWarmup), and it iterates across all blocks in the mutation log, adding items found to MutationLogHarvester::loading - a map of vbucket to a pair of

        {key, mutation_log_event_t}

        .

        The loading map is populated in a single call - so for example if the size of this map pushes memory usage above the high watermark, then fewer items than desired will be loaded. Once the loading map is deleted - when the MutationLogHarvester is deleted (i.e. at the end of Warmup::doWarmup) then memory usage will drop.

        We should look at addressing this - along the same lines as Mike's comment - load keys incrementally, or at least in chunks.

        Show
        drigby Dave Rigby added a comment - I believe Mike's comments still stand as of current master - MutationLogHarvester::load() is responsible for loading the log from disk (called from Warmup::doWarmup ), and it iterates across all blocks in the mutation log, adding items found to MutationLogHarvester::loading - a map of vbucket to a pair of {key, mutation_log_event_t} . The loading map is populated in a single call - so for example if the size of this map pushes memory usage above the high watermark, then fewer items than desired will be loaded. Once the loading map is deleted - when the MutationLogHarvester is deleted (i.e. at the end of Warmup::doWarmup) then memory usage will drop. We should look at addressing this - along the same lines as Mike's comment - load keys incrementally, or at least in chunks.

          People

          • Assignee:
            drigby Dave Rigby
            Reporter:
            mikew Mike Wiederhold
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:

              Gerrit Reviews

              There are no open Gerrit changes