Details
- Bug
- Resolution: Done
- Major
Description
When the Cluster Monitor starts or adds a new cluster, it creates an agent port for each agent running on a node, initializing the agent if it is not already initialized. However, if an agent restarts, loses its credentials file, and returns to the waiting state, the Cluster Monitor does not detect this, and its next request for the agent's checkers fails.
The Cluster Monitor should detect this and re-initialize the agent. This is not trivial, because the agent only registers the HTTP handlers for the health routes once it is initialized, so a request to an uninitialized agent simply returns a 404. Possible approaches:
- Agent-side: add a fallback (route not found) handler that returns a "not initialized" error if the agent is not ready, and a plain 404 if it is
- CM-side: detect a 404 and attempt to re-initialize
- Possibly use a more specific 404 body or header to make this easier to detect
Issue Links
- is triggered by: CMOS-276 Agent: Rework authentication and bootstrapping (Done)