Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Security Level: Public
    • Labels:
      None

      Description

      Moved over from other ticket:

      Hi Michael,

      I have tested the new version 2.9.1 (I see that the delete method with cas is there) and it still fails to delete a locked object. It works with a gets and then a delete (this works without problem) but not with a lock and then a delete. So I cannot delete a locked object with this patch. My code is very simple:

      URI base = new URI("http://localhost:8091/pools");
      ArrayList baseURIs = new ArrayList();
      baseURIs.add(base);
      client = new CouchbaseClient(baseURIs, "default", null, "");
      String key = "dd2982ff68406c937040cd0c509b";
      System.err.println("Creating the object");
      OperationFuture<Boolean> addFuture = client.add(key, 300, "lala");
      System.err.println("Status add: " + addFuture.getStatus());
      System.err.println("Getting and locking the object");
      OperationFuture<CASValue<Object>> future = client.asyncGetAndLock(key, 30);
      System.err.println("Status getl: " + future.getStatus());
      if (!future.getStatus().isSuccess())

      { throw new Exception("Error locking!!!!"); }

      System.err.println("Lock is mine");
      System.err.println("Status getl: " + future.get());
      System.err.println("delete");
      OperationFuture<Boolean> future4 = client.delete(key, future.get().getCas());
      System.err.println("delete: " + future4.getStatus());
      if (!future4.getStatus().isSuccess())

      { throw new Exception("Error deleting!!!"); }

      The error is: "Temporary failure". If the getAndLock is substituted by an asyncGets it works perfectly.

      Is this error something that is going to be fixed in another bug? is this one still in progress?

        Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

          daschl Michael Nitschinger created issue -
          Hide
          avsej Sergey Avseyev added a comment -

          It is also reproducing using libcouchbase. See source attached

          Show
          avsej Sergey Avseyev added a comment - It is also reproducing using libcouchbase. See source attached
          avsej Sergey Avseyev made changes -
          Field Original Value New Value
          Attachment spy-129.c [ 18390 ]
          Hide
          rickyepoderi Ricky Martin added a comment -

          Does it mean that the problem is in the server part? Because then maybe the bug should be moved from SPY to somewhere else. Thanks!

          Show
          rickyepoderi Ricky Martin added a comment - Does it mean that the problem is in the server part? Because then maybe the bug should be moved from SPY to somewhere else. Thanks!
          Hide
          rickyepoderi Ricky Martin added a comment - - edited

          I've been playing today a bit with the couchbase server. An I detected that the problem is in the "ep-engine/src/stored-value.hh" file. It seems that the delete operation is just aborted if the value is blocked. If you see the "unlocked_softDelete" function it just does the following:

          if (v->isLocked(ep_current_time()))

          { return IS_LOCKED; }

          Then the "ep.cc" file in the "EventuallyPersistentStore::deleteItem" method transform this LOCKED error in the temporary one:

          if (delrv == NOT_FOUND || delrv == INVALID_CAS)

          { rv = (delrv == INVALID_CAS) ? ENGINE_KEY_EEXISTS : ENGINE_KEY_ENOENT; }

          else if (delrv == IS_LOCKED)

          { rv = ENGINE_TMPFAIL; <----------- HERE!!!! }

          else

          { // WAS_CLEAN or WAS_DIRTY rv = ENGINE_SUCCESS; }

          So the problem is in the server part for sure. I've done this little modification in the "stored-value.hh":

          $ diff -p stored-value.hh stored-value.hh.ORIG

              • stored-value.hh 2013-10-25 12:12:50.575935678 +0200
              • stored-value.hh.ORIG 2013-10-25 12:12:05.255936484 +0200
                                      • public:
              • 1152,1158 ****
                return rv;
                }

          ! if (v->isLocked(ep_current_time()) && cas == 0)

          { return IS_LOCKED; }

          — 1152,1158 ----
          return rv;
          }

          ! if (v->isLocked(ep_current_time())) { return IS_LOCKED; }

          I just return an error if the CAS is not passed (another if checks that the CAS is correct). Now my little java example works. Obviously my modification is just a direct fix (I dunno if it has further or collateral effects).

          $java -cp target/couchbase-manager-0.2.0.jar:... es.rickyepoderi.couchbasemanager.Test2
          Creating the object
          Status add:

          {OperationStatus success=true: OK}

          Getting and locking the object
          Status getl:

          {OperationStatus success=true: OK}

          Lock is mine
          Status getl:

          {CasValue 14598213114486/lala}

          delete CAS: 14598213114486
          delete:

          {OperationStatus success=true: OK}

          Thanks!

          Show
          rickyepoderi Ricky Martin added a comment - - edited I've been playing today a bit with the couchbase server. An I detected that the problem is in the "ep-engine/src/stored-value.hh" file. It seems that the delete operation is just aborted if the value is blocked. If you see the "unlocked_softDelete" function it just does the following: if (v->isLocked(ep_current_time())) { return IS_LOCKED; } Then the "ep.cc" file in the "EventuallyPersistentStore::deleteItem" method transform this LOCKED error in the temporary one: if (delrv == NOT_FOUND || delrv == INVALID_CAS) { rv = (delrv == INVALID_CAS) ? ENGINE_KEY_EEXISTS : ENGINE_KEY_ENOENT; } else if (delrv == IS_LOCKED) { rv = ENGINE_TMPFAIL; <----------- HERE!!!! } else { // WAS_CLEAN or WAS_DIRTY rv = ENGINE_SUCCESS; } So the problem is in the server part for sure. I've done this little modification in the "stored-value.hh": $ diff -p stored-value.hh stored-value.hh.ORIG stored-value.hh 2013-10-25 12:12:50.575935678 +0200 stored-value.hh.ORIG 2013-10-25 12:12:05.255936484 +0200 public: 1152,1158 **** return rv; } ! if (v->isLocked(ep_current_time()) && cas == 0) { return IS_LOCKED; } — 1152,1158 ---- return rv; } ! if (v->isLocked(ep_current_time())) { return IS_LOCKED; } I just return an error if the CAS is not passed (another if checks that the CAS is correct). Now my little java example works. Obviously my modification is just a direct fix (I dunno if it has further or collateral effects). $java -cp target/couchbase-manager-0.2.0.jar:... es.rickyepoderi.couchbasemanager.Test2 Creating the object Status add: {OperationStatus success=true: OK} Getting and locking the object Status getl: {OperationStatus success=true: OK} Lock is mine Status getl: {CasValue 14598213114486/lala} delete CAS: 14598213114486 delete: {OperationStatus success=true: OK} Thanks!
          Hide
          rickyepoderi Ricky Martin added a comment -

          I've checked the "set" method in the same file ("stored-value.hh") and it performs more or less the same thing I did in my fix. The only difference is that in the "set" function IS_LOCKED is always returned when the object is locked and cas is wrong (not sent or different), in my fix IS_LOCKED is returned if cas is not sent and INVALID_CAS if cas is different.

          I've also passed the tests in the "ep-engine" and all of them passed:

          PASS: atomic_ptr_test
          PASS: atomic_test
          PASS: chunk_creation_test
          PASS: dispatcher_test
          PASS: hash_table_test
          PASS: histo_test
          PASS: hrtime_test
          PASS: json_test
          PASS: misc_test
          PASS: mutex_test
          PASS: priority_test
          PASS: ringbuffer_test
          ===================
          All 12 tests passed
          ===================
          /home/blog/couchbase/couchbase-server_src/install/bin/engine_testapp \
          \
          -E .libs/ep.so -t 60 \
          -T .libs/ep_testsuite.so \
          -e 'flushall_enabled=true;ht_size=13;ht_locks=7;'
          1..214

          1. Passed 214 of 214 tests

          So I think my solution (or a very similar one) is good enough. But please, check it twice anyways.

          Thanks!

          Show
          rickyepoderi Ricky Martin added a comment - I've checked the "set" method in the same file ("stored-value.hh") and it performs more or less the same thing I did in my fix. The only difference is that in the "set" function IS_LOCKED is always returned when the object is locked and cas is wrong (not sent or different), in my fix IS_LOCKED is returned if cas is not sent and INVALID_CAS if cas is different. I've also passed the tests in the "ep-engine" and all of them passed: PASS: atomic_ptr_test PASS: atomic_test PASS: chunk_creation_test PASS: dispatcher_test PASS: hash_table_test PASS: histo_test PASS: hrtime_test PASS: json_test PASS: misc_test PASS: mutex_test PASS: priority_test PASS: ringbuffer_test =================== All 12 tests passed =================== /home/blog/couchbase/couchbase-server_src/install/bin/engine_testapp \ \ -E .libs/ep.so -t 60 \ -T .libs/ep_testsuite.so \ -e 'flushall_enabled=true;ht_size=13;ht_locks=7;' 1..214 Passed 214 of 214 tests So I think my solution (or a very similar one) is good enough. But please, check it twice anyways. Thanks!
          Hide
          rickyepoderi Ricky Martin added a comment -

          I've just seen that a similar fix was done two months ago:

          https://github.com/couchbase/ep-engine/commit/0cbc4d75e3759fc28a09fe1a4a6ee26ad5651584#diff-981aa56e84093252864ec649ce5d5643

          I feel like an idiot, I've wasted two or three hours with this. At least the solution is more or less the same.

          Sorry but I've tested with last community bundle (2.1.1) and there it was not solved. Can anyone test my code against last trunk and close this bug (I suppose that in 2.2.0 it's also solved).

          Thanks again!

          Show
          rickyepoderi Ricky Martin added a comment - I've just seen that a similar fix was done two months ago: https://github.com/couchbase/ep-engine/commit/0cbc4d75e3759fc28a09fe1a4a6ee26ad5651584#diff-981aa56e84093252864ec649ce5d5643 I feel like an idiot, I've wasted two or three hours with this. At least the solution is more or less the same. Sorry but I've tested with last community bundle (2.1.1) and there it was not solved. Can anyone test my code against last trunk and close this bug (I suppose that in 2.2.0 it's also solved). Thanks again!
          Hide
          rickyepoderi Ricky Martin added a comment -

          I can confirm that this issue is fixed in 2.2.0. I tested it this weekend.

          Please feel free to close the bug.

          Show
          rickyepoderi Ricky Martin added a comment - I can confirm that this issue is fixed in 2.2.0. I tested it this weekend. Please feel free to close the bug.

            People

            • Assignee:
              daschl Michael Nitschinger
              Reporter:
              daschl Michael Nitschinger
            • Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:

                Gerrit Reviews

                There are no open Gerrit changes