  1. Couchbase Server
  2. MB-7558

We don't replicate the lock time related to "get and lock"

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.8.1, 2.0
    • Fix Version/s: feature-backlog
    • Component/s: couchbase-bucket
    • Security Level: Public
    • Labels:
      None
    • Triage:
      Untriaged

      Description

      The thing is that it will grow the disk queue, but yes it is a bug. Please file it and assign it to me. I'm almost positive that we don't persist items when the lock time changes.

      On Jan 20, 2013, at 2:20 PM, "Matt Ingenthron" <matt@couchbase.com> wrote:

      On 1/20/13 2:18 PM, "Mike Wiederhold" <mike@couchbase.com> wrote:

      I don't think we persist the lock time so it wouldn't survive an ejection.

      Given that ejection is mostly random, that seems like it could be a problem.

      Is that by intent? Shouldn't we prevent ejection for locked items (as a normal approach)?

      Matt


        Activity

        mikew Mike Wiederhold added a comment -

        Since locks are short-lived, I think it would make sense for us to just not allow eviction of a locked item.

        mikew Mike Wiederhold added a comment -

        I just realized that I was completely wrong about what I said in the email. Lock time is part of the metadata, which never gets ejected. The only time you would lose the lock time is if you restarted the server, but given that our max lock time is 30 seconds and that restarting the server will take at least this long, I don't think this is an issue. Perry, please close this bug unless you have any other comments.

        perry Perry Krug added a comment -

        Thanks Mike...switching gears on this, what about replicating the lock time so that it survives a failover?

        mikew Mike Wiederhold added a comment -

        Auto-failover takes at least 30 seconds to realize a node is down which would again make this essentially useless to replicate. I also don't see why anyone would want to fail over a server in less than 30 seconds in production.

        perry Perry Krug added a comment -

        In fact they would...and do

        -We have our own bugs/improvements open to drastically reduce the amount of time to failover
        -Some customers have written their own tools to detect a failure in ms and initiate a failover
        -Sometimes we do a failover on a "live" node that is still replicating data

        I agree with the persistence argument, but think that this lock time needs to be replicated.
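
To make the timing argument in this thread concrete, here is a minimal sketch of the window during which a lock taken on the active node is unknown to a promoted replica. It assumes the lock time is not replicated (as this ticket reports) and uses the 30-second maximum lock time quoted above. The function name and parameters are hypothetical and are not part of any Couchbase API.

```python
MAX_LOCK_SECONDS = 30.0  # maximum lock time quoted in this thread


def unprotected_window(lock_seconds, failure_offset, failover_delay):
    """Seconds for which a lock would still be valid on the failed node
    but is unknown to the replica that has taken over.

    lock_seconds   -- requested lock duration, capped at MAX_LOCK_SECONDS
    failure_offset -- seconds between taking the lock and the node failing
    failover_delay -- seconds between the failure and the replica taking over
    """
    lock_expiry = min(float(lock_seconds), MAX_LOCK_SECONDS)
    takeover = float(failure_offset) + float(failover_delay)
    # Once the replica takes over after the lock would have expired,
    # nothing is lost; otherwise the remainder of the lock is unprotected.
    return max(0.0, lock_expiry - takeover)


# Auto-failover that needs 30 s to detect the failure: the lock has expired.
slow = unprotected_window(30, 0, 30)
# Custom tooling that fails over in 50 ms: nearly the whole lock is exposed.
fast = unprotected_window(30, 0, 0.05)
# Manual failover of a live node, 10 s into a 15 s lock.
manual = unprotected_window(15, 10, 0)
print(slow, fast, manual)
```

Under these assumptions the window is zero only when failover takes at least as long as the remaining lock time, which matches the auto-failover case but not the fast or manual failover cases listed above.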

        maria Maria McDuff (Inactive) added a comment -

        Perry,

        Is this a must-fix for the next release (targeting Dec), or can it be deferred?
        Please let us know if it is.

        perry Perry Krug added a comment -

        No, this can be deferred from that release.

        mikew Mike Wiederhold added a comment -

        This will be resolved with the UPR changes

        maria Maria McDuff (Inactive) added a comment -

        Mike, is this now part of UPR changes in 3.0?


          People

          • Assignee:
            chiyoung Chiyoung Seo
            Reporter:
            perry Perry Krug
          • Votes:
            0
            Watchers:
            4
