Details
- Epic
- Resolution: Unresolved
- Major
- None
- Node Agent
Description
Some information simply cannot be accessed without running code directly on the Couchbase Server hosts. Examples include the THP (Transparent Huge Pages) configuration, which is a kernel parameter, and whether there are firewalls between the CBS nodes, which can only be checked by trying to establish TCP connections between them and therefore needs node-level access.
This is a long-term tracking issue for all the work needed to make checks like these possible. The exact implementation is up for discussion. In the long term it would ideally be embedded within CBS (either in ns_server or as a babysitter-managed agent), but for the time being we're interested in the lowest-friction solutions for incremental adoption.
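To make the two example checks in the description concrete, here is a minimal sketch in Python. It is illustrative only and is not the agent's actual implementation. The function names (`parse_thp_mode`, `read_thp_mode`, `can_connect`) are hypothetical. The sysfs path is the standard Linux location for the THP setting, though some distributions may expose it elsewhere.

```python
import socket
from pathlib import Path

# Standard Linux sysfs location for the THP setting (assumed to apply to the host).
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(contents: str) -> str:
    """Return the active THP mode from sysfs text such as 'always madvise [never]'.

    The kernel marks the active mode by wrapping it in square brackets.
    """
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active THP mode found in {contents!r}")


def read_thp_mode(path: Path = THP_ENABLED_PATH) -> str:
    """Read the THP mode from the local host; this has to run on the node itself."""
    return parse_thp_mode(path.read_text())


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    A False result does not distinguish a firewall from a service that is
    simply not listening; a real check would need to tell these apart.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run on each node, `read_thp_mode()` could be compared against the expected value, and `can_connect(peer, port)` could be called for every other node in the cluster, for example against port 8091, the default Couchbase Server management port. Which ports to probe and how to report the results are left open here.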
Issue Links
- blocks
  - CMOS-410 Check for NAS storage (To Do)
  - CMOS-262 Add IO Utilisation Checker (To Do)
  - CMOS-241 Segfault Checker (Done)
  - CMOS-252 Add check for firewalls between nodes (Done)
  - CMOS-253 Add check for ulimit values (Done)
  - CMOS-254 Alert on CB processes getting OOM killed (Done)
  - CMOS-255 Check for THP (Done)
  - CMOS-314 Add a check to see if the node uses SAN storage (Done)
  - CMOS-209 Add check for Deprecated/supported OS (Done)
- relates to
  - MB-46882 Collect System Metrics for Health Checks (Open)
  - CMOS-174 7.x Compatible Prometheus Exporter for 6.x clusters (To Do)
Gerrit Reviews
For Gerrit Dashboard: CMOS-210

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
168832,7 | CMOS-210 Agent: Organisation to prepare for adding new features | master | cbmultimanager | MERGED | +2 | +1 |
168833,20 | CMOS-210 Agent: Introduce Prometheus Exporter and bootstrap | master | cbmultimanager | MERGED | +2 | +1 |
168834,22 | CMOS-210 Agent: Graceful shutdown | master | cbmultimanager | MERGED | +2 | +1 |
168858,22 | CMOS-210 Agent: Introduce Fluent Bit | master | cbmultimanager | MERGED | +2 | +1 |
168899,25 | CMOS-210 Agent: Introduce Hazelnut (log analyzer) | master | cbmultimanager | MERGED | +2 | +1 |
168948,26 | CMOS-210 Agent: Move Hazelnut rules into JSON | master | cbmultimanager | MERGED | +2 | +1 |
169061,1 | WIP CMOS-210 Dmesg | master | cbmultimanager | ABANDONED | 0 | 0 |
169136,24 | CMOS-210, CMOS-254 Add analyser for dmesg + OOM kill check | master | cbmultimanager | MERGED | +2 | +1 |
169739,32 | CMOS-210 Agent: refactor main function into Agent struct | master | cbmultimanager | MERGED | +2 | +1 |
169833,36 | CMOS-210 Refactor check run logic | master | cbmultimanager | MERGED | +2 | +1 |
169905,2 | CMOS-210 Add fluent-bit to couchbase-cluster-monitor | master | manifest | MERGED | +2 | +1 |
171115,4 | CMOS-210 Upgrade yacpe to pick up FTS/Eventing/System metrics | master | cbmultimanager | MERGED | +2 | +1 |