Description
The Couchbase daemon failed to start during setup. We should handle this case better: if the daemon fails to start, we should perform teardown so that the machine is left in a clean state. We should also log which node failed to start, to help with debugging.
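The requested behaviour can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the actual jepsen.couchbase change (that project is written in Clojure). The callbacks `start`, `is_running`, and `teardown`, the `DaemonStartError` type, and the retry parameters are all hypothetical names introduced for the example. Only `wait_for_daemon` and `setup_node` mirror names that appear in the stack trace, and their real signatures are not known from this ticket.

```python
import logging
import time

log = logging.getLogger("setup")


class DaemonStartError(Exception):
    """Raised when a node's daemon does not come up in time (hypothetical type)."""

    def __init__(self, node):
        # Put the node name in the message so the failing machine is obvious.
        super().__init__(f"daemon failed to start on node {node}")
        self.node = node


def wait_for_daemon(node, is_running, retries=5, delay=1.0):
    """Poll is_running(node) until it returns True or the retries run out."""
    for _ in range(retries):
        if is_running(node):
            return
        time.sleep(delay)
    raise DaemonStartError(node)


def setup_node(node, start, is_running, teardown, retries=5, delay=1.0):
    """Start the daemon on node; on failure, log the node and tear it down."""
    start(node)
    try:
        wait_for_daemon(node, is_running, retries, delay)
    except DaemonStartError:
        log.error("daemon failed to start on node %s; running teardown", node)
        try:
            teardown(node)
        except Exception:
            # A failing teardown should not hide the original startup error.
            log.exception("teardown also failed on node %s", node)
        raise
```

Re-raising after teardown keeps the test run failing as before, but the error now names the node and the machine is cleaned up first.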
Stack trace of crash (from kv-engine-jepsen-post-commit-145):
2019-07-17 23:51:37,372{GMT} WARN [main] jepsen.core: Test crashed!
java.lang.Exception: daemon failed to start
    at couchbase.util$wait_for_daemon$fn__2787.invoke(util.clj:355) ~[na:na]
    at couchbase.util$wait_for_daemon.invokeStatic(util.clj:351) ~[na:na]
    at couchbase.util$wait_for_daemon.invoke(util.clj:345) ~[na:na]
    at couchbase.util$setup_node.invokeStatic(util.clj:399) ~[na:na]
    at couchbase.util$setup_node.invoke(util.clj:374) ~[na:na]
    at couchbase.core$couchbase$reify__4628.setup_BANG_(core.clj:21) ~[na:na]
    at jepsen.db$fn__2954$G__2933__2958.invoke(db.clj:8) ~[jepsen-0.1.14.jar:na]
    at jepsen.db$fn__2954$G__2932__2963.invoke(db.clj:8) ~[jepsen-0.1.14.jar:na]
    at clojure.core$partial$fn__5839.invoke(core.clj:2625) ~[clojure-1.10.1.jar:na]
    at jepsen.control$on_nodes$fn__2918.invoke(control.clj:391) ~[jepsen-0.1.14.jar:na]
    at clojure.lang.AFn.applyToHelper(AFn.java:154) ~[clojure-1.10.1.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:144) ~[clojure-1.10.1.jar:na]
    at clojure.core$apply.invokeStatic(core.clj:665) ~[clojure-1.10.1.jar:na]
    at clojure.core$with_bindings_STAR_.invokeStatic(core.clj:1973) ~[clojure-1.10.1.jar:na]
    at clojure.core$with_bindings_STAR_.doInvoke(core.clj:1973) ~[clojure-1.10.1.jar:na]
    at clojure.lang.RestFn.applyTo(RestFn.java:142) ~[clojure-1.10.1.jar:na]
    at clojure.core$apply.invokeStatic(core.clj:669) ~[clojure-1.10.1.jar:na]
    at clojure.core$bound_fn_STAR_$fn__5749.doInvoke(core.clj:2003) ~[clojure-1.10.1.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:408) ~[clojure-1.10.1.jar:na]
    at dom_top.core$real_pmap_helper$build_thread__214$fn__215.invoke(core.clj:146) ~[jepsen-0.1.14.jar:na]
    at clojure.lang.AFn.applyToHelper(AFn.java:152) ~[clojure-1.10.1.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:144) ~[clojure-1.10.1.jar:na]
    at clojure.core$apply.invokeStatic(core.clj:665) ~[clojure-1.10.1.jar:na]
    at clojure.core$with_bindings_STAR_.invokeStatic(core.clj:1973) ~[clojure-1.10.1.jar:na]
    at clojure.core$with_bindings_STAR_.doInvoke(core.clj:1973) ~[clojure-1.10.1.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:425) ~[clojure-1.10.1.jar:na]
    at clojure.lang.AFn.applyToHelper(AFn.java:156) ~[clojure-1.10.1.jar:na]
    at clojure.lang.RestFn.applyTo(RestFn.java:132) ~[clojure-1.10.1.jar:na]
    at clojure.core$apply.invokeStatic(core.clj:669) ~[clojure-1.10.1.jar:na]
    at clojure.core$bound_fn_STAR_$fn__5749.doInvoke(core.clj:2003) ~[clojure-1.10.1.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.10.1.jar:na]
    at clojure.lang.AFn.run(AFn.java:22) ~[clojure-1.10.1.jar:na]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_212]
Attachments
For Gerrit Dashboard: MB-35155

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
112484,5 | MB-35155 Fix Jepsen crash due to couchbase demon not starting | master | jepsen.couchbase | MERGED | +2 | +1 |