Couchbase Monitoring and Observability Stack / CMOS-319

Agent: re-initialize if the agent returns to Waiting state

Details

    • Bug
    • Resolution: Done
    • Major
    • 0.2
    • None
    • cluster-monitor
    • None

    Description

      When the Cluster Monitor starts or adds a new cluster, it creates agent ports for each agent running on a node, which initializes the agent if it is not already initialized. However, if an agent restarts, loses its credentials file, and returns to the Waiting state, the Cluster Monitor does not detect this, and the next request for the agent's checkers fails.

      We should detect this and re-initialize the agent. However, this is not trivial: the agent only adds the HTTP handlers for the health routes once it is initialized, so an uninitialized agent simply returns a 404 for them. Possible approaches:

      • Agent-side: add a fallback (route-not-found) handler that returns a "not initialized" error if the agent isn't ready, or a plain 404 if it is
      • CM-side: detect the 404 and attempt to re-initialize the agent
        • Possibly use a more specific 404 body or header to make this easier to detect

            People

              marks.polakovs Marks Polakovs (Inactive)