Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Fix Version: 1.0.0
Description
Testcase: TestPersistentVolumeKillAllPods
Scenario:
- Created a 4-node Couchbase cluster with a PVC defined for all nodes
- Killed all couchbase-server pods (a reproduction sketch follows this list)
- All pods were recovered and the cluster was rebalanced
- However, a MemberRecovered event was raised for only 2 pods instead of the expected 3
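For context, a minimal reproduction sketch using client-go is below. The kubeconfig path, namespace, label selector, and pod count are assumptions and will not match the actual test harness exactly.

package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumed kubeconfig in the default location; the test harness wires this differently.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	const namespace = "default"      // assumed test namespace
	const selector = "app=couchbase" // assumed couchbase-server pod label
	const expectedPods = 4
	ctx := context.Background()

	// Kill all couchbase-server pods at once.
	if err := client.CoreV1().Pods(namespace).DeleteCollection(ctx,
		metav1.DeleteOptions{}, metav1.ListOptions{LabelSelector: selector}); err != nil {
		panic(err)
	}

	// Wait for the operator to recover every pod.
	for {
		pods, err := client.CoreV1().Pods(namespace).List(ctx,
			metav1.ListOptions{LabelSelector: selector})
		if err != nil {
			panic(err)
		}
		ready := 0
		for _, p := range pods.Items {
			for _, c := range p.Status.Conditions {
				if c.Type == "Ready" && c.Status == "True" {
					ready++
				}
			}
		}
		fmt.Printf("%d/%d pods ready\n", ready, expectedPods)
		if ready == expectedPods {
			return
		}
		time.Sleep(10 * time.Second)
	}
}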
Events:
  Type     Reason              Age              From                                 Message
  ----     ------              ----             ----                                 -------
  Normal   NewMemberAdded      10m              couchbase-operator-585d4b675d-j74s4  New member test-couchbase-xrlsc-0000 added to cluster
  Normal   NewMemberAdded      10m              couchbase-operator-585d4b675d-j74s4  New member test-couchbase-xrlsc-0001 added to cluster
  Normal   NewMemberAdded      9m               couchbase-operator-585d4b675d-j74s4  New member test-couchbase-xrlsc-0002 added to cluster
  Normal   NewMemberAdded      8m               couchbase-operator-585d4b675d-j74s4  New member test-couchbase-xrlsc-0003 added to cluster
  Normal   RebalanceStarted    8m               couchbase-operator-585d4b675d-j74s4  A rebalance has been started to balance data across the cluster
  Normal   RebalanceCompleted  8m               couchbase-operator-585d4b675d-j74s4  A rebalance has completed
  Normal   BucketCreated       8m               couchbase-operator-585d4b675d-j74s4  A new bucket `PVBucket` was created
  Warning  MemberDown          6m (x7 over 7m)  couchbase-operator-585d4b675d-j74s4  Existing member test-couchbase-xrlsc-0002 down
  Warning  MemberDown          6m (x8 over 7m)  couchbase-operator-585d4b675d-j74s4  Existing member test-couchbase-xrlsc-0000 down
  Normal   MemberRecovered     5m               couchbase-operator-585d4b675d-j74s4  Existing member test-couchbase-xrlsc-0000 recovered
  Warning  MemberDown          5m (x8 over 7m)  couchbase-operator-585d4b675d-j74s4  Existing member test-couchbase-xrlsc-0003 down
  Normal   MemberRecovered     5m               couchbase-operator-585d4b675d-j74s4  Existing member test-couchbase-xrlsc-0003 recovered
  Normal   RebalanceStarted    3m               couchbase-operator-585d4b675d-j74s4  A rebalance has been started to balance data across the cluster
  Normal   RebalanceCompleted  3m               couchbase-operator-585d4b675d-j74s4  A rebalance has completed
I've filed K8S-541 to track the dropping of events, which can happen for various reasons. Given that the correct things are happening in terms of the cluster being rebuilt by the operator, I'm not too concerned about the dropped event, but I would like to add log messages for when this happens in the future. For 1.0.0 let's re-run this test and see if it happens frequently. Given that I can see the code that raises the event was run, I don't think we should see this issue in test re-runs.
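As a sketch of the kind of logging I have in mind (assuming the operator records events through client-go's record.EventRecorder; the real wiring may differ), the recorder could be wrapped so that every raised event also produces a log line, making a drop between the operator and the API server visible:

package events

import (
	"log"

	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
)

// loggingRecorder wraps an EventRecorder so every raised event is also written
// to the operator log. If the Event object is later dropped before it reaches
// the API server, the log line still shows that the event was raised.
type loggingRecorder struct {
	record.EventRecorder
}

func (r loggingRecorder) Event(object runtime.Object, eventtype, reason, message string) {
	log.Printf("raising event: type=%s reason=%s message=%q", eventtype, reason, message)
	r.EventRecorder.Event(object, eventtype, reason, message)
}

Only Event is wrapped here; Eventf and related methods would need the same treatment in a real change.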