Couchbase Server / MB-43212

Improve visibility of Warmup failure to users


Details

    Description

      Note: This is cloned off from MB-19578 - while the dataloss issue in the original MB has been fixed, there are still requests for improvements in how warmup failure is made visible to users, which this MB will track.

      Comment on MB-19578 from David Haikney:

      Mike Wiederhold [X] - quick bit of context is that previously when people went from full eviction to value eviction, we could hide ("lose") some of their data because there wasn't enough space in memory for all of the keys. Now we correctly fail warmup in this scenario. However we don't do a good job of informing the user what is occurring.
      Assigning this to you to see if you can cook up a reasonable way of conveying to the user that their warmup has failed because there is a lack of available space for all of their keys.

      And reply by MikeW:

      I'll need to have a discussion with the ns_server team since this needs to be supported in the REST APIs. Ideally it would be good to have a separate node status and service status for each service. If a specific service is not available or a node is down, it would be good to have an error message that we could also display. I'll see what I can do given the time remaining in Spock, but I won't have much time to take on another improvement in Spock for another two weeks.

      Original Description

      When a user switches from full eviction to value eviction, if the total metadata is bigger than the memory quota of the bucket, then during warmup keys will not be loaded; the warmup will still complete and the bucket will come online. As those keys are not in memory, the bucket now believes they do not exist, and the data is effectively lost!

      As a workaround, if the user is lucky enough not to have done any updates, they can change the eviction policy back to full eviction and regain access to the data.

      Unfortunately the only way the user would know that this has happened is if they check the memcached.log or if they notice a drop in the item count.
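      As a sketch of that manual check (the log path and cbstats location below are typical Linux install defaults, assumed here rather than taken from this ticket):

        # Default Linux install locations; adjust for your deployment (assumption).
        LOG=/opt/couchbase/var/lib/couchbase/logs/memcached.log

        # Look for warmup-related messages, e.g. warmup completing without all keys.
        grep -i "warmup" "$LOG" 2>/dev/null || echo "no memcached.log at $LOG"

        # cbstats' "warmup" stat group exposes the ep_warmup_* counters per bucket.
        /opt/couchbase/bin/cbstats localhost:11210 -b default warmup 2>/dev/null || true

      Today those two places are the only signal that warmup failed, which is what this MB asks to improve.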

      Reproduce

      1. Create a single node cluster
      2. Create default bucket with 100MB quota and full eviction
      3. Insert 3 million items into the bucket

        cbc-pillowfight  -I 3000000 -p superlongnameofkey1234567890123456789012345678902 -M 1
        

      4. Note the item count
      5. Change the eviction policy to value eviction
      6. During warmup the estimated item count is corrected
      7. When warmup has finished the item count is wrong.
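      The arithmetic behind the failure can be sketched as follows. Value eviction must keep every key and its metadata resident, and the per-item overhead used below (~50 bytes) is an illustrative assumption, not a figure from this ticket:

        # Back-of-envelope: can all keys' metadata fit in the bucket quota
        # under value eviction? The 50-byte per-item overhead is assumed.
        ITEMS = 3_000_000                  # items loaded by cbc-pillowfight
        KEY_BYTES = len("superlongnameofkey1234567890123456789012345678902")
        PER_ITEM_OVERHEAD = 50             # assumed metadata bytes per stored value
        QUOTA = 100 * 1024 * 1024          # 100MB bucket quota from step 2

        needed = ITEMS * (KEY_BYTES + PER_ITEM_OVERHEAD)
        print(f"metadata footprint ~{needed / 1024**2:.0f} MB vs quota "
              f"{QUOTA / 1024**2:.0f} MB")
        # With these assumptions the keys alone need ~283 MB, far over the
        # 100 MB quota, so warmup under value eviction cannot load them all.

      Under these assumptions the footprint exceeds the quota roughly threefold, which is why warmup now (correctly) fails rather than silently dropping keys.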


      People

        shivani.gupta Shivani Gupta
        drigby Dave Rigby (Inactive)
        Votes: 0
        Watchers: 3
