Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Version: 1.1.0
Description
Testcase: TestPersistentVolumeKillPodAndOperator
Scenario:
- Couchbase cluster of size 4 is created with default persistent volume mount
- Operator and pod 0001 are killed in parallel
Observation:
Once the operator respawns, it creates a new pod 0004 to replace the killed pod without attempting delta recovery of pod 0001.
Cluster events:
Event schema validation failed:

NewMemberAdded     | New member test-couchbase-54gn7-0000 added to cluster
NewMemberAdded     | New member test-couchbase-54gn7-0001 added to cluster
NewMemberAdded     | New member test-couchbase-54gn7-0002 added to cluster
NewMemberAdded     | New member test-couchbase-54gn7-0003 added to cluster
RebalanceStarted   | A rebalance has been started to balance data across the cluster
RebalanceCompleted | A rebalance has completed
BucketCreated      | A new bucket `PVBucket` was created
NewMemberAdded     | New member test-couchbase-54gn7-0004 added to cluster | <== no anyof members matched
RebalanceStarted   | A rebalance has been started to balance data across the cluster
MemberRemoved      | Existing member test-couchbase-54gn7-0001 removed from the cluster
RebalanceCompleted | A rebalance has completed
Recreate locally:
time="2018-09-26T09:40:01Z" level=info msg="Node status:" cluster-name=cb-example module=cluster
time="2018-09-26T09:40:01Z" level=info msg="┌─────────────────┬──────────────┬────────────────┐" cluster-name=cb-example module=cluster
time="2018-09-26T09:40:01Z" level=info msg="│ Server │ Class │ Status │" cluster-name=cb-example module=cluster
time="2018-09-26T09:40:01Z" level=info msg="├─────────────────┼──────────────┼────────────────┤" cluster-name=cb-example module=cluster
time="2018-09-26T09:40:01Z" level=info msg="│ cb-example-0000 │ all_services │ managed+active │" cluster-name=cb-example module=cluster
time="2018-09-26T09:40:01Z" level=info msg="│ cb-example-0001 │ all_services │ managed+active │" cluster-name=cb-example module=cluster
time="2018-09-26T09:40:01Z" level=info msg="│ cb-example-0002 │ all_services │ managed+down │" cluster-name=cb-example module=cluster
time="2018-09-26T09:40:01Z" level=info msg="└─────────────────┴──────────────┴────────────────┘" cluster-name=cb-example module=cluster
All running pods and PVCs are correctly aggregated (in updateMembers()) and the state is correct. In the cbopinfo collection there is an entry for:
persistentvolumeclaim/pvc-couchbase-test-couchbase-54gn7-0001-00-default/pvc-couchbase-test-couchbase-54gn7-0001-00-default.yaml
and its labels look correct.
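The aggregation described above can be sketched as follows. This is a hypothetical illustration, not the operator's actual updateMembers() code: the label key `couchbase_cluster`, the `object` type, and the PVC naming scheme `pvc-couchbase-<member>-00-default` are assumptions inferred from the names in this report.

```go
package main

import (
	"fmt"
	"strings"
)

// object is a minimal stand-in for a pod or PVC: just a name and labels.
type object struct {
	name   string
	labels map[string]string
}

// selectorMatches reports whether every key/value pair in sel appears in labels.
func selectorMatches(labels, sel map[string]string) bool {
	for k, v := range sel {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// memberFromPVC extracts the member name from a PVC named like
// pvc-couchbase-<member>-00-default (the naming seen in cbopinfo).
func memberFromPVC(name string) string {
	const prefix, suffix = "pvc-couchbase-", "-00-default"
	if strings.HasPrefix(name, prefix) && strings.HasSuffix(name, suffix) {
		return name[len(prefix) : len(name)-len(suffix)]
	}
	return ""
}

// recoverableMembers lists members whose PVC matches the cluster selector but
// that have no running pod: these are candidates for delta recovery rather
// than replacement with a brand-new pod. Note that a PVC whose labels fail
// the selector is silently skipped, which is one way a killed member could
// vanish from the operator's view on restart.
func recoverableMembers(pods, pvcs []object, sel map[string]string) []string {
	alive := map[string]bool{}
	for _, p := range pods {
		if selectorMatches(p.labels, sel) {
			alive[p.name] = true
		}
	}
	var out []string
	for _, c := range pvcs {
		if !selectorMatches(c.labels, sel) {
			continue
		}
		if m := memberFromPVC(c.name); m != "" && !alive[m] {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	sel := map[string]string{"couchbase_cluster": "test-couchbase-54gn7"}
	pods := []object{{"test-couchbase-54gn7-0000", sel}} // pod 0001 was killed
	pvcs := []object{
		{"pvc-couchbase-test-couchbase-54gn7-0000-00-default", sel},
		{"pvc-couchbase-test-couchbase-54gn7-0001-00-default", sel},
	}
	fmt.Println(recoverableMembers(pods, pvcs, sel)) // [test-couchbase-54gn7-0001]
}
```

Under these assumptions, 0001 should come back as a delta-recovery candidate as long as its PVC survives and matches the selector, so either the PVC lookup or the label match would have to fail for the operator to create 0004 instead.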
Tommie McAfee, does anything jump out at you? It seems test-couchbase-54gn7-0001 isn't getting picked up somehow on restart, and I'm not seeing this behaviour in minikube.