Couchbase Server / MB-35078

[Rebalance failure] GSL: Precondition failure being triggered when HCS > HPS [2019/7/19]

Details

    • Untriaged
    • No
    • KV-Engine Mad-Hatter Beta

    Description

      During kv-engine-jepsen-post-commit-131 we observed a GSL precondition failure, triggered because the high completed seqno (HCS) was greater than the high prepared seqno (HPS).

      2019-07-15T09:18:51.497858-07:00 ERROR 56: exception occurred in runloop during packet execution. Cookie info: [{"aiostat":"success","connection":"[ 127.0.0.1:41574 - 127.0.0.1:11209 (<ud>@ns_server</ud>) ]","engine_storage":"0x00007f7749dbf010","ewouldblock":false,"packet":{"bodylen":20,"cas":0,"datatype":"raw","extlen":20,"key":"<ud></ud>","keylen":0,"magic":"ClientRequest","opaque":8,"opcode":"DCP_SNAPSHOT_MARKER","vbucket":688},"refcount":1}] - closing connection ([ 127.0.0.1:41574 - 127.0.0.1:11209 (<ud>@ns_server</ud>) ]): GSL: Precondition failure at /home/couchbase/jenkins/workspace/kv-engine-jepsen-post-commit/kv_engine/engines/ep/src/vbucket.cc: 3895
      

      This in turn causes a rebalance failure in the Jepsen test:

      2019-07-15 09:19:01,266{GMT}	WARN	[jepsen nemesis] jepsen.core: Process :nemesis crashed
      java.lang.RuntimeException: Rebalance failed
      	at couchbase.util$wait_for_rebalance_complete.invokeStatic(util.clj:123) ~[na:na]
      	at couchbase.util$wait_for_rebalance_complete.invoke(util.clj:112) ~[na:na]
      	at couchbase.util$wait_for_rebalance_complete.invokeStatic(util.clj:114) ~[na:na]
      	at couchbase.util$wait_for_rebalance_complete.invoke(util.clj:112) ~[na:na]
      	at couchbase.util$wait_for_rebalance_complete.invokeStatic(util.clj:113) ~[na:na]
      	at couchbase.util$wait_for_rebalance_complete.invoke(util.clj:112) ~[na:na]
      	at couchbase.util$rebalance.invokeStatic(util.clj:153) ~[na:na]
      	at couchbase.util$rebalance.invoke(util.clj:136) ~[na:na]
      	at couchbase.util$rebalance.invokeStatic(util.clj:138) ~[na:na]
      	at couchbase.util$rebalance.invoke(util.clj:136) ~[na:na]
      	at couchbase.nemesis$couchbase$reify__3367.invoke_BANG_(nemesis.clj:181) ~[na:na]
      	at jepsen.nemesis$invoke_compat_BANG_.invokeStatic(nemesis.clj:40) ~[jepsen-0.1.14.jar:na]
      	at jepsen.nemesis$invoke_compat_BANG_.invoke(nemesis.clj:36) ~[jepsen-0.1.14.jar:na]
      	at couchbase.core$_main$fn__4709$fn__4710.invoke(core.clj:236) ~[na:na]
      .....
      

      Method that contains the GSL precondition:

      void VBucket::setUpAllowedDuplicatePrepareWindow() {
          auto& dm = getDurabilityMonitor();
          auto hcs = dm.getHighCompletedSeqno();
          auto hps = dm.getHighPreparedSeqno();
          Expects(hcs <= hps);
       
          int64_t newDuplicateCount = hps - hcs;
          allowedDuplicatePrepareSeqnos.reserve(allowedDuplicatePrepareSeqnos.size() +
                                                newDuplicateCount);
       
          for (int64_t dupSeqno = hcs + 1; dupSeqno <= hps; dupSeqno++) {
              allowedDuplicatePrepareSeqnos.insert(dupSeqno);
          }
      }
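
The precondition can be illustrated in isolation. The following is a minimal sketch, not the kv_engine code: `duplicatePrepareWindow` and `preconditionFires` are hypothetical names, plain integers stand in for the durability monitor's values, and a `std::logic_error` stands in for the GSL `Expects` failure (the log above reports it as an exception, so a throwing check is assumed here).

```cpp
#include <cstdint>
#include <set>
#include <stdexcept>

// Hypothetical stand-in for the window set-up shown above: returns the
// seqnos in the half-open range (hcs, hps], or throws if hcs > hps.
std::set<int64_t> duplicatePrepareWindow(int64_t hcs, int64_t hps) {
    // Assumed equivalent of Expects(hcs <= hps), modelled as an exception.
    if (hcs > hps) {
        throw std::logic_error("Precondition failure: HCS > HPS");
    }

    std::set<int64_t> window;
    for (int64_t seqno = hcs + 1; seqno <= hps; ++seqno) {
        window.insert(seqno);
    }
    return window;
}

// Returns true if the precondition check rejects the given pair.
bool preconditionFires(int64_t hcs, int64_t hps) {
    try {
        duplicatePrepareWindow(hcs, hps);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

Under these assumptions, `duplicatePrepareWindow(3, 5)` yields the seqnos 4 and 5, `duplicatePrepareWindow(5, 5)` yields an empty window, and `preconditionFires(6, 5)` is true, which corresponds to the HCS > HPS state seen in this failure.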
      

      To run the Jepsen test, use the following command:

      lein trampoline run test --nodes-file ./nodes --username vagrant --ssh-private-key ./resources/vagrantkey --package /home/couchbase/jenkins/workspace/kv-engine-jepsen-post-commit/install  --workload=failover --failover-type=hard --recovery-type=delta --replicas=2 --no-autofailover --disrupt-count=1 --rate=0 --durability=0:100:0:0 --eviction-policy=value
      

            People

              james.harrison James Harrison (Inactive)
              richard.demellow Richard deMellow