Details
Bug
Resolution: Not a Bug
Critical
6.5.0
Untriaged
No
Description
During a Jepsen run, I noticed that most durable writes fail with RequestTimeoutException when a network partition is introduced into the cluster. In this scenario, one would expect the durable writes to fail with DurabilityAmbiguousException instead, but they mainly fail with RequestTimeoutException.
I see DurabilityAmbiguousException only sporadically (mainly just when the nemesis is introduced).
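To make the "mostly timeouts, occasionally ambiguous" observation measurable, the workload's outcomes can be counted per exception type. The following is only a minimal sketch: the two exception classes are hypothetical local stand-ins for the SDK types named above, and `write` is an assumed callable wrapping whatever durable write the workload performs.

```python
from collections import Counter


# Hypothetical stand-ins for the SDK exception types named in this report;
# a real workload would catch the SDK's own classes instead.
class RequestTimeoutException(Exception):
    pass


class DurabilityAmbiguousException(Exception):
    pass


def tally_outcomes(write, keys):
    """Call write(key) for every key and count outcomes by exception class name.

    `write` is an assumed callable that performs one durable write and
    raises on failure; successful writes are counted under "ok".
    """
    counts = Counter()
    for key in keys:
        try:
            write(key)
            counts["ok"] += 1
        except Exception as exc:
            counts[type(exc).__name__] += 1
    return counts
```

Comparing the counts taken before, during, and after the partition window would show whether RequestTimeoutException really dominates while the nemesis is active.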
Steps to reproduce:
By running the Jepsen test:
Run the following Jepsen test and check the workload when the nemesis is triggered.
lein trampoline run test --nodes-file ./nodes --username root --password couchbase --package ./couchbase-server-enterprise-6.5.0-4242-centos7.x86_64.rpm --workload=partition --replicas=1 --node-count=6 --no-autofailover --durability=0:100:0:0 --disrupt-count=1 --disrupt-time=10 --kv-timeout=5 --doc-count=10000 --doc-threads=1
Steps to reproduce without running Jepsen tests:
1. Create a 6-node cluster with a bucket that has 1 replica
2. Start a workload with durability level set to Majority
3. Introduce a network split between 1 node and the rest of the cluster (the node cannot communicate with the rest of the cluster, and the rest of the cluster cannot communicate with the node)
4. Observe the exceptions raised in the workload
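One possible way to create the split in step 3 by hand is to drop traffic to and from the other cluster nodes with iptables on the node being isolated. This is an assumption about how to reproduce the partition manually, not a description of what the Jepsen nemesis does. The sketch below only builds the command strings (the helper name and the IP addresses are hypothetical), so they can be reviewed before being run as root on the isolated node.

```python
def partition_commands(peer_ips):
    """Build iptables commands that cut one node off from its cluster peers.

    The commands are meant to be run on the node being isolated. Only
    traffic to and from the listed peers is dropped, so clients can still
    reach the node. Nothing is executed here; strings are returned.
    """
    commands = []
    for ip in peer_ips:
        # Drop packets arriving from the peer.
        commands.append(f"iptables -A INPUT -s {ip} -j DROP")
        # Drop packets sent to the peer.
        commands.append(f"iptables -A OUTPUT -d {ip} -j DROP")
    return commands


def heal_commands(peer_ips):
    """Build the matching commands that delete the rules added above."""
    return [cmd.replace(" -A ", " -D ", 1) for cmd in partition_commands(peer_ips)]


if __name__ == "__main__":
    # Hypothetical addresses of the five other nodes in a 6-node cluster.
    peers = [f"10.0.0.{n}" for n in range(2, 7)]
    for cmd in partition_commands(peers):
        print(cmd)
```

Running the output of `heal_commands` with the same peer list removes the rules and ends the partition.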