Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Security Level: Public
    • Labels:
      None

      Description

      Moved over from other ticket:

      Hi Michael,

      I have tested the new version 2.9.1 (I see that the delete method with cas is there) and it still fails to delete a locked object. It works with a gets and then a delete (this works without problem) but not with a lock and then a delete. So I cannot delete a locked object with this patch. My code is very simple:

      URI base = new URI("http://localhost:8091/pools");
      ArrayList baseURIs = new ArrayList();
      baseURIs.add(base);
      client = new CouchbaseClient(baseURIs, "default", null, "");
      String key = "dd2982ff68406c937040cd0c509b";
      System.err.println("Creating the object");
      OperationFuture<Boolean> addFuture = client.add(key, 300, "lala");
      System.err.println("Status add: " + addFuture.getStatus());
      System.err.println("Getting and locking the object");
      OperationFuture<CASValue<Object>> future = client.asyncGetAndLock(key, 30);
      System.err.println("Status getl: " + future.getStatus());
      if (!future.getStatus().isSuccess())

      { throw new Exception("Error locking!!!!"); }

      System.err.println("Lock is mine");
      System.err.println("Status getl: " + future.get());
      System.err.println("delete");
      OperationFuture<Boolean> future4 = client.delete(key, future.get().getCas());
      System.err.println("delete: " + future4.getStatus());
      if (!future4.getStatus().isSuccess())

      { throw new Exception("Error deleting!!!"); }

      The error is: "Temporary failure". If the getAndLock is substituted by an asyncGets it works perfectly.

      Is this error something that is going to be fixed in another bug? is this one still in progress?

      1. spy-129.c
        6 kB
        Sergey Avseyev
      No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

        Hide
        avsej Sergey Avseyev added a comment -

        It is also reproducing using libcouchbase. See source attached

        Show
        avsej Sergey Avseyev added a comment - It is also reproducing using libcouchbase. See source attached
        Hide
        rickyepoderi Ricky Martin added a comment -

        Does it mean that the problem is in the server part? Because then maybe the bug should be moved from SPY to somewhere else. Thanks!

        Show
        rickyepoderi Ricky Martin added a comment - Does it mean that the problem is in the server part? Because then maybe the bug should be moved from SPY to somewhere else. Thanks!
        Hide
        rickyepoderi Ricky Martin added a comment - - edited

        I've been playing today a bit with the couchbase server. An I detected that the problem is in the "ep-engine/src/stored-value.hh" file. It seems that the delete operation is just aborted if the value is blocked. If you see the "unlocked_softDelete" function it just does the following:

        if (v->isLocked(ep_current_time()))

        { return IS_LOCKED; }

        Then the "ep.cc" file in the "EventuallyPersistentStore::deleteItem" method transform this LOCKED error in the temporary one:

        if (delrv == NOT_FOUND || delrv == INVALID_CAS)

        { rv = (delrv == INVALID_CAS) ? ENGINE_KEY_EEXISTS : ENGINE_KEY_ENOENT; }

        else if (delrv == IS_LOCKED)

        { rv = ENGINE_TMPFAIL; <----------- HERE!!!! }

        else

        { // WAS_CLEAN or WAS_DIRTY rv = ENGINE_SUCCESS; }

        So the problem is in the server part for sure. I've done this little modification in the "stored-value.hh":

        $ diff -p stored-value.hh stored-value.hh.ORIG

            • stored-value.hh 2013-10-25 12:12:50.575935678 +0200
            • stored-value.hh.ORIG 2013-10-25 12:12:05.255936484 +0200
                                    • public:
            • 1152,1158 ****
              return rv;
              }

        ! if (v->isLocked(ep_current_time()) && cas == 0)

        { return IS_LOCKED; }

        — 1152,1158 ----
        return rv;
        }

        ! if (v->isLocked(ep_current_time())) { return IS_LOCKED; }

        I just return an error if the CAS is not passed (another if checks that the CAS is correct). Now my little java example works. Obviously my modification is just a direct fix (I dunno if it has further or collateral effects).

        $java -cp target/couchbase-manager-0.2.0.jar:... es.rickyepoderi.couchbasemanager.Test2
        Creating the object
        Status add:

        {OperationStatus success=true: OK}

        Getting and locking the object
        Status getl:

        {OperationStatus success=true: OK}

        Lock is mine
        Status getl:

        {CasValue 14598213114486/lala}

        delete CAS: 14598213114486
        delete:

        {OperationStatus success=true: OK}

        Thanks!

        Show
        rickyepoderi Ricky Martin added a comment - - edited I've been playing today a bit with the couchbase server. An I detected that the problem is in the "ep-engine/src/stored-value.hh" file. It seems that the delete operation is just aborted if the value is blocked. If you see the "unlocked_softDelete" function it just does the following: if (v->isLocked(ep_current_time())) { return IS_LOCKED; } Then the "ep.cc" file in the "EventuallyPersistentStore::deleteItem" method transform this LOCKED error in the temporary one: if (delrv == NOT_FOUND || delrv == INVALID_CAS) { rv = (delrv == INVALID_CAS) ? ENGINE_KEY_EEXISTS : ENGINE_KEY_ENOENT; } else if (delrv == IS_LOCKED) { rv = ENGINE_TMPFAIL; <----------- HERE!!!! } else { // WAS_CLEAN or WAS_DIRTY rv = ENGINE_SUCCESS; } So the problem is in the server part for sure. I've done this little modification in the "stored-value.hh": $ diff -p stored-value.hh stored-value.hh.ORIG stored-value.hh 2013-10-25 12:12:50.575935678 +0200 stored-value.hh.ORIG 2013-10-25 12:12:05.255936484 +0200 public: 1152,1158 **** return rv; } ! if (v->isLocked(ep_current_time()) && cas == 0) { return IS_LOCKED; } — 1152,1158 ---- return rv; } ! if (v->isLocked(ep_current_time())) { return IS_LOCKED; } I just return an error if the CAS is not passed (another if checks that the CAS is correct). Now my little java example works. Obviously my modification is just a direct fix (I dunno if it has further or collateral effects). $java -cp target/couchbase-manager-0.2.0.jar:... es.rickyepoderi.couchbasemanager.Test2 Creating the object Status add: {OperationStatus success=true: OK} Getting and locking the object Status getl: {OperationStatus success=true: OK} Lock is mine Status getl: {CasValue 14598213114486/lala} delete CAS: 14598213114486 delete: {OperationStatus success=true: OK} Thanks!
        Hide
        rickyepoderi Ricky Martin added a comment -

        I've checked the "set" method in the same file ("stored-value.hh") and it performs more or less the same thing I did in my fix. The only difference is that in the "set" function IS_LOCKED is always returned when the object is locked and cas is wrong (not sent or different), in my fix IS_LOCKED is returned if cas is not sent and INVALID_CAS if cas is different.

        I've also passed the tests in the "ep-engine" and all of them passed:

        PASS: atomic_ptr_test
        PASS: atomic_test
        PASS: chunk_creation_test
        PASS: dispatcher_test
        PASS: hash_table_test
        PASS: histo_test
        PASS: hrtime_test
        PASS: json_test
        PASS: misc_test
        PASS: mutex_test
        PASS: priority_test
        PASS: ringbuffer_test
        ===================
        All 12 tests passed
        ===================
        /home/blog/couchbase/couchbase-server_src/install/bin/engine_testapp \
        \
        -E .libs/ep.so -t 60 \
        -T .libs/ep_testsuite.so \
        -e 'flushall_enabled=true;ht_size=13;ht_locks=7;'
        1..214

        1. Passed 214 of 214 tests

        So I think my solution (or a very similar one) is good enough. But please, check it twice anyways.

        Thanks!

        Show
        rickyepoderi Ricky Martin added a comment - I've checked the "set" method in the same file ("stored-value.hh") and it performs more or less the same thing I did in my fix. The only difference is that in the "set" function IS_LOCKED is always returned when the object is locked and cas is wrong (not sent or different), in my fix IS_LOCKED is returned if cas is not sent and INVALID_CAS if cas is different. I've also passed the tests in the "ep-engine" and all of them passed: PASS: atomic_ptr_test PASS: atomic_test PASS: chunk_creation_test PASS: dispatcher_test PASS: hash_table_test PASS: histo_test PASS: hrtime_test PASS: json_test PASS: misc_test PASS: mutex_test PASS: priority_test PASS: ringbuffer_test =================== All 12 tests passed =================== /home/blog/couchbase/couchbase-server_src/install/bin/engine_testapp \ \ -E .libs/ep.so -t 60 \ -T .libs/ep_testsuite.so \ -e 'flushall_enabled=true;ht_size=13;ht_locks=7;' 1..214 Passed 214 of 214 tests So I think my solution (or a very similar one) is good enough. But please, check it twice anyways. Thanks!
        Hide
        rickyepoderi Ricky Martin added a comment -

        I've just seen that a similar fix was done two months ago:

        https://github.com/couchbase/ep-engine/commit/0cbc4d75e3759fc28a09fe1a4a6ee26ad5651584#diff-981aa56e84093252864ec649ce5d5643

        I feel like an idiot, I've wasted two or three hours with this. At least the solution is more or less the same.

        Sorry but I've tested with last community bundle (2.1.1) and there it was not solved. Can anyone test my code against last trunk and close this bug (I suppose that in 2.2.0 it's also solved).

        Thanks again!

        Show
        rickyepoderi Ricky Martin added a comment - I've just seen that a similar fix was done two months ago: https://github.com/couchbase/ep-engine/commit/0cbc4d75e3759fc28a09fe1a4a6ee26ad5651584#diff-981aa56e84093252864ec649ce5d5643 I feel like an idiot, I've wasted two or three hours with this. At least the solution is more or less the same. Sorry but I've tested with last community bundle (2.1.1) and there it was not solved. Can anyone test my code against last trunk and close this bug (I suppose that in 2.2.0 it's also solved). Thanks again!
        Hide
        rickyepoderi Ricky Martin added a comment -

        I can confirm that this issue is fixed in 2.2.0. I tested it this weekend.

        Please feel free to close the bug.

        Show
        rickyepoderi Ricky Martin added a comment - I can confirm that this issue is fixed in 2.2.0. I tested it this weekend. Please feel free to close the bug.

          People

          • Assignee:
            daschl Michael Nitschinger
            Reporter:
            daschl Michael Nitschinger
          • Votes:
            1 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:

              Gerrit Reviews

              There are no open Gerrit changes