Details

Type: Improvement
Resolution: Unresolved
Priority: Major
Fix Version/s: 0.3
Affects Version/s: None
Component/s: cluster-monitor
Labels: None

Description

Currently the Status loop (which runs the health checks) runs every five minutes, so an issue can go unnoticed for up to five minutes. This can lead to inconsistent data in the dashboards and poor UX.

Ideas for how we could improve this:

- Just run the checkers more frequently. I'd rather not, since they could quickly overload clusters.
- Split the checkers into "frequent" and "less frequent" groups that run at different intervals.
- Re-run some checkers (those that only need "cluster summary" data and nothing else) as soon as the cluster summaries are updated (which the Heart loop does every minute).
  - Related to that, possibly use streaming / long-polling to update that data near-instantly.
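
A rough sketch of how the second and third ideas could fit together, to make the discussion more concrete. This is not cluster-monitor's actual code: `Checker`, `due_checkers`, `run_checkers` and `on_summary_updated` are hypothetical names, and the per-checker interval and the `summary_only` flag are assumptions about how the grouping might be expressed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Checker:
    """Hypothetical checker description (not the real cluster-monitor type)."""
    name: str
    interval_s: float           # how often the periodic loop should run it
    summary_only: bool          # True if it needs only cluster summary data
    run: Callable[[dict], str]  # takes a cluster summary, returns a status
    last_run: float = float("-inf")  # never run yet


def due_checkers(checkers: List[Checker], now: float) -> List[Checker]:
    """Checkers whose own interval has elapsed ("frequent" vs "less frequent")."""
    return [c for c in checkers if now - c.last_run >= c.interval_s]


def run_checkers(checkers: List[Checker], summary: dict, now: float) -> Dict[str, str]:
    """Run the given checkers and record when each one last ran."""
    results = {}
    for c in checkers:
        results[c.name] = c.run(summary)
        c.last_run = now
    return results


def on_summary_updated(checkers: List[Checker], summary: dict, now: float) -> Dict[str, str]:
    """Hook for the Heart loop: re-run only the cheap, summary-only checkers."""
    cheap = [c for c in checkers if c.summary_only]
    return run_checkers(cheap, summary, now)
```

In this sketch the Status loop would call `run_checkers(due_checkers(...), ...)` on a short tick, while the Heart loop would call `on_summary_updated` after each summary refresh. Expensive checkers keep their five-minute interval, so the extra load on clusters stays limited to checkers that need no additional requests.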

Attachments

Issue Links

relates to

CMOS-335 Use streaming instead of / alongside heartbeat loop (To Do)


Activity

People

Assignee: Shaashwat Jain

Reporter: Marks Polakovs (Inactive)

Votes: 0

Watchers: 3

Dates

Created: 03/Dec/21 7:39 AM

Updated: 05/Dec/22 2:22 AM


Faster health check updates
