Details

Type: Bug
Resolution: Won't Do
Priority: Major
Fix Version/s: None
Affects Version/s: 2.7.20
Component/s: Infrastructure
Labels: MCA
Story Points: 1

Description

If a client is using the Node Health Failure Detector and has configured their MCA using IP addresses, downloading the cluster map after a cluster switch may put the detector into the alert state before the Coordinator has closed the grace period. This can prevent the detector from re-alerting when nodes actually fail, because it is already in the alert state and therefore has no state change to report.

Sequence of events:

1 - Nodes fail on Cluster 1, detector goes into alert state (RED).

2 - Coordinator enters grace period.

3 - Coordinator switches to Cluster 2, resets detector alert state to GREEN.

4 - Cluster map received, adds nodes using DNS names, disconnects from IP addresses.

5 - Detector picks up the disconnects, goes into alert state (RED).

6 - Coordinator still in grace period, ignores alert, leaves detector in RED state.

7 - When a node does fail, the detector picks up the failure but is already in the RED state, so no state change is sent to the Coordinator.
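
The sequence above can be reproduced with a small state model. This is only a sketch under assumptions, not the SDK's code: the `Coordinator` and `Detector` classes, their methods, and the rule that the detector reports only GREEN-to-RED transitions are hypothetical names and simplifications taken from the description in this issue.

```python
from enum import Enum


class State(Enum):
    GREEN = "GREEN"
    RED = "RED"


class Coordinator:
    """Hypothetical coordinator that ignores alerts during its grace period."""

    def __init__(self):
        self.in_grace_period = False
        self.active_cluster = 1
        self.handled_alerts = 0

    def on_alert(self, detector):
        if self.in_grace_period:
            # Step 6: alert ignored, detector is left in RED.
            return
        # Steps 2-3: enter grace period, switch cluster, reset detector.
        self.handled_alerts += 1
        self.in_grace_period = True
        self.active_cluster = 2 if self.active_cluster == 1 else 1
        detector.state = State.GREEN


class Detector:
    """Hypothetical detector that reports only GREEN -> RED transitions."""

    def __init__(self, coordinator):
        self.state = State.GREEN
        self.coordinator = coordinator

    def on_disconnect(self):
        """Return True if a state change was sent to the coordinator."""
        if self.state is State.RED:
            return False  # already alerting, nothing new to report
        self.state = State.RED
        self.coordinator.on_alert(self)
        return True


def simulate():
    coordinator = Coordinator()
    detector = Detector(coordinator)
    detector.on_disconnect()               # steps 1-3: real failure, switch
    detector.on_disconnect()               # steps 4-6: IP disconnects in grace
    coordinator.in_grace_period = False    # grace period closes
    reported = detector.on_disconnect()    # step 7: real failure on Cluster 2
    return detector.state, coordinator.handled_alerts, reported
```

In this model, `simulate()` leaves the detector in RED with only one alert handled by the coordinator, and the final failure is not reported, which matches the behavior described in step 7.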

Attached a sample from the SDK debug logs showing the sequence.

Attachments

mca-sdk-debug-1.log
17 kB
21/Jul/21 1:58 PM

People

Assignee: Michael Nitschinger

Reporter: Davis Chapman [X] (Inactive)

Votes: 0

Watchers: 2

Dates

Created: 21/Jul/21 1:59 PM

Updated: 31/Oct/23 4:12 PM

Resolved: 31/Oct/23 4:12 PM

Gerrit Reviews

There are no open Gerrit changes

MCA Clusters configured with IP addresses may not trigger 2nd alert
