Description
Jepsen tests fail because the Jepsen framework is unable to add nodes to a cluster. The crash occurs because the REST call to add nodes, made from the first node (172.23.105.3), throws an exception and returns an HTTP status 500 error with "Unexpected server error".
An example of a failed Jepsen test:
http://qa.sc.couchbase.com/job/jepsen-durability-failover-daily-new/115/consoleFull
lein trampoline run test --nodes-file ./nodes --username root --password couchbase --package ./couchbase-server-enterprise-7.0.0-1576-centos7.x86_64.rpm --workload=failover --node-count=6 --no-autofailover --replicas=1 --failover-type=hard --recovery-type=full --disrupt-count=1 --kv-timeout=1.5 --durability=0:100:0:0 --doc-count=4000 --doc-threads=1 &> jepsen-output-1.log
To reproduce this manually:
1. Spin up 2 nodes with Vagrant (10.112.194.101 and 10.112.194.102) running the above server build.
2. Set up a new one-node cluster on 10.112.194.101.
3. Add the second node to this cluster:
curl -u Administrator:abc123 10.112.194.101:8091/controller/addNode -d 'hostname=10.112.194.102&user=Administrator&password=abc123'
This returns:
["Prepare join failed. Got HTTP status 500 from REST call post to https://10.112.194.102:18091/engageCluster2. Body was: \"\\\"Unexpected server error, request logged.\\\"\""]
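For repeated runs, step 3 of the manual reproduction could be wrapped in a small script that also prints the HTTP status of the addNode call. This is only a sketch: the function names `add_node_body` and `add_node` are made up for illustration, and it assumes that a successful add returns HTTP 200 and that the credentials contain no characters needing URL encoding.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of any Couchbase tooling.

# Build the form body for /controller/addNode (same fields as the manual curl call).
add_node_body() {
  local new_node="$1" user="$2" password="$3"
  printf 'hostname=%s&user=%s&password=%s' "$new_node" "$user" "$password"
}

# POST the addNode request to an existing cluster node, then print the
# HTTP status and the response body.
add_node() {
  local cluster_node="$1" new_node="$2" user="$3" password="$4"
  local body_file status
  body_file="$(mktemp)"
  status="$(curl -s -o "$body_file" -w '%{http_code}' \
    -u "$user:$password" \
    "http://$cluster_node:8091/controller/addNode" \
    -d "$(add_node_body "$new_node" "$user" "$password")")"
  echo "HTTP status: $status"
  cat "$body_file"
  echo
  rm -f "$body_file"
  # Assumption: anything other than 200 is treated as a failed join.
  [ "$status" = "200" ]
}
```

With the two Vagrant nodes from the steps above, it would be invoked as `add_node 10.112.194.101 10.112.194.102 Administrator abc123`; a non-zero exit code indicates that the join did not succeed.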
The build sanity job for this server build has also failed: http://server.jenkins.couchbase.com/job/build_sanity_matrix/7191/DISTRO=suse12,TYPE=4node/consoleText
Attaching the ns-server error log, the Jepsen logs, and a few screenshots (all from the Jepsen test).