Details

Type: Improvement
Resolution: Unresolved
Priority: Minor
Fix Version/s: not-targeted
Affects Version/s: None
Component/s: cluster-monitor
Labels: None

Description

From the Intuit presentation at KubeCon 2021: consider deploying Fluentd as an aggregator that Fluent Bit forwards to. The Fluent Bit configuration then stays simple: it either always or optionally forwards to the aggregator.

The aggregator can be updated dynamically and scaled independently (including down to 0) as required. It also makes the configuration of the logging back end independent of the sidecars.

There is also the potential to reduce the work done in the sidecar by pushing it to the aggregator. In particular, multiline handling and enrichment can be done in batches rather than per log line.
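As a rough illustration of batch processing on the aggregator side, the sketch below merges continuation lines into multiline records and then enriches the whole batch in one pass. It is plain Python rather than Fluentd configuration, and everything in it is an assumption made for illustration: the function names, the rule that a new record starts with a date, and the metadata keys.

```python
import re
from typing import Dict, Iterable, List

# Assumption: a new record starts with a YYYY-MM-DD date; any other line
# is treated as a continuation of the previous record (e.g. a stack trace).
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def merge_multiline(lines: Iterable[str]) -> List[str]:
    """Join continuation lines onto the record that precedes them."""
    records: List[str] = []
    for line in lines:
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += "\n" + line
    return records


def enrich_batch(records: List[str], metadata: Dict[str, str]) -> List[Dict[str, str]]:
    """Attach the same metadata to every record in the batch."""
    return [{"log": record, **metadata} for record in records]


# Hypothetical batch as it might arrive from one sidecar.
batch = [
    "2021-05-08T07:59:00 ERROR request failed",
    "Traceback (most recent call last):",
    '  File "app.py", line 1, in <module>',
    "2021-05-08T07:59:01 INFO retrying",
]
merged = merge_multiline(batch)
events = enrich_batch(merged, {"pod": "example-pod", "namespace": "default"})
```

With this split, the sidecar only needs to forward raw lines, and the rules for record boundaries and metadata live in one place.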

Covered in my previous proposal.

Now that logs are being shipped out, we need a good mechanism to collect and aggregate them to provide a simple system overview. This would also simplify log collection for support by providing a single point to collect from.

A quick summary of the 12-factor recommendation is in order here:
In staging or production deploys, each process’ stream will be captured by the execution environment, collated together with all other streams from the app, and routed to one or more final destinations for viewing and long-term archival. These archival destinations are not visible to or configurable by the app, and instead are completely managed by the execution environment. Open-source log routers (such as Logplex and Fluentd) are available for this purpose.

The event stream for an app can be routed to a file, or watched via realtime tail in a terminal. Most significantly, the stream can be sent to a log indexing and analysis system such as Splunk, or a general-purpose data warehousing system such as Hadoop/Hive. These systems allow for great power and flexibility for introspecting an app’s behavior over time.

At the very least, we should aggregate the information from the server pods as well as the operator and DAC pods. We should also extend this with Kubernetes event information (e.g. pod scheduling, eviction, node failure) and anything else that might be useful. This will give us a much fuller understanding of the system at any point in time.
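One way to picture the combined view is a single timeline built from several time-ordered streams. The sketch below is a minimal illustration in plain Python, not a proposed implementation: the Event type, the source names, and the sample messages are all hypothetical, and it assumes each stream is already sorted by timestamp.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: str  # ISO 8601 in UTC, so string order matches time order
    source: str     # hypothetical source label
    message: str


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge streams that are each already sorted by timestamp."""
    return heapq.merge(*streams, key=lambda event: event.timestamp)


# Hypothetical sample data standing in for real log and event sources.
server_logs = [
    Event("2021-05-08T07:59:00Z", "server", "node started"),
    Event("2021-05-08T08:01:00Z", "server", "rebalance complete"),
]
operator_logs = [Event("2021-05-08T07:58:30Z", "operator", "creating pod")]
kube_events = [Event("2021-05-08T08:00:00Z", "k8s-event", "pod evicted")]

timeline = list(merge_streams(server_logs, operator_logs, kube_events))
for event in timeline:
    print(event.timestamp, event.source, event.message)
```

Interleaving Kubernetes events with pod logs in this way is what makes it possible to line up, for example, an eviction with the log lines around it.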

Attachments

Screenshot from 2021-05-08 15-55-04.png (926 kB, 08/May/21 8:11 AM)

Issue Links

relates to: CMOS-169 Consider logging output from some CMOS components into loki (To Do)

People

Assignee: Patrick Stephens (Inactive)

Reporter: Patrick Stephens (Inactive)

Votes: 0

Watchers: 2

Dates

Created: 08/May/21 7:59 AM

Updated: 17/Nov/21 11:01 AM

Gerrit Reviews

There are no open Gerrit changes

Observability aggregator support
