  1. Couchbase Kubernetes
  2. K8S-2108

Fluent Bit improvements

Details

    • Improvement
    • Status: Closed
    • Minor
    • Resolution: Done
    • None
    • 2.3.0
    • logging, operator
    • None
    • 1

    Description

      Various improvements found that may be useful during testing:

      1. Include cluster name when enriching the log data
      2. Reduce the tail refresh interval (currently 60 seconds) so it picks up logs sooner. The container starts quickly, but if the log directory or rebalance is not present, it can take a while to pick up logs once the server starts, so we may lose logs on a quick failure.
      3. Provide full integration tests with CI
      4. Default Loki output: need to make sure there is no impact on customer usage; ideally a simple method to enable it during testing but with a managed config. Relates to K8S-2112.
      5. Look to see if we can provide counters for various errors and/or Prometheus metrics (optionally): coming soon in Fluent Bit but also see https://github.com/neiman-marcus/fluent-bit-out-prometheus-metrics
      6. Add Docker-compose stack as an example for local usage.
      7. Reduce cyclomatic complexity and refactor the watcher to simplify it; standardise logging as per the operator too.
      8. Add unit tests for watcher functionality: all covered by integration tests currently, so shift left if possible.
      9. Document GKE setup: issues with Autopilot and Promtail, and stalling of the Loki input. It seems to work now without a PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
      10. Rewrite the 4-letter level names for Java logs. Good example of sorting out case as well: https://github.com/sassoftware/viya4-monitoring-kubernetes/blob/eaaf0498f835cbabbcf9f55715ddeafae2d68ca5/logging/fb/fluent-bit_config.configmap_open.yaml#L731
      11. Ensure we test mount path changes, i.e. that we pick up the config from there (or mount it in) and can watch for changes: K8S-2324
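
      As a rough illustration of items 1 and 10, the enrichment could look something like the Python sketch below. It is not the actual Fluent Bit filter configuration; the record keys (`level`, `couchbase.cluster`), the alias table, and the function name `enrich_record` are all assumptions made for the example.

```python
from typing import Any, Dict

# Assumed mapping from truncated 4-letter level names to full names.
# The real set of names emitted by the Java logs may differ.
LEVEL_ALIASES: Dict[str, str] = {
    "DEBU": "DEBUG",
    "ERRO": "ERROR",
    "TRAC": "TRACE",
    "FATA": "FATAL",
}


def enrich_record(record: Dict[str, Any], cluster_name: str) -> Dict[str, Any]:
    """Return a copy of the record with a normalised level and the cluster name added."""
    enriched = dict(record)  # do not mutate the caller's record

    # Normalise case first, then expand any truncated level name.
    raw_level = str(enriched.get("level", "")).strip().upper()
    if raw_level:
        enriched["level"] = LEVEL_ALIASES.get(raw_level, raw_level)

    # Hypothetical key name for the cluster; the real key is not given in this issue.
    enriched["couchbase.cluster"] = cluster_name
    return enriched


if __name__ == "__main__":
    print(enrich_record({"level": "erro", "message": "example"}, "cb-example"))
```

      Normalising case before the lookup means `erro`, `Erro` and `ERRO` all map to the same output, which covers the "sorting out case" part of item 10 with a single table.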

          Activity

            patrick.stephens Patrick Stephens (Inactive) created issue -
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Field Original Value New Value
            Fix Version/s not-targeted [ 16613 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description Various improvements found that may be useful during testing:
            1. Include cluster name when enriching the log data
            2. Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
            3. Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
            Various improvements found that may be useful during testing:
            1. Include cluster name when enriching the log data
            2. Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
            3. Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
            4. Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
            1. Include cluster name when enriching the log data
            2. Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
            3. Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
            4. Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config.
            5. Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
            1. Include cluster name when enriching the log data
            2. Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
            3. Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
            4. Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config.
            5. Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
            6. Add Docker-compose stack as an example for local usage.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
            1. Include cluster name when enriching the log data
            2. Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
            3. Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
            4. Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config.
            5. Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
            6. Add Docker-compose stack as an example for local usage.
            7. Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config.
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # Add Docker-compose stack as an example for local usage.
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # Add Docker-compose stack as an example for local usage.
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Link This issue relates to K8S-2112 [ K8S-2112 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # Add Docker-compose stack as an example for local usage.
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # Reduce cyclometric complexity and refactor watcher to simplify - standardise logging as per operator too.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # Add Docker-compose stack as an example for local usage.
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # Reduce cyclometric complexity and refactor watcher to simplify - standardise logging as per operator too.
             # Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # Add Docker-compose stack as an example for local usage.
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclometric complexity and refactor watcher to simplify- - standardise logging as per operator too.
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclometric complexity and refactor watcher to simplify- -- standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclometric complexity and refactor watcher to simplify- – standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclometric complexity and refactor watcher to simplify- – standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
             # Rewrite the 4 letter level names for java logs
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclomatic complexity and refactor watcher to simplify- – standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
             # Rewrite the 4 letter level names for java logs
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.
             # Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally)
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclomatic complexity and refactor watcher to simplify- – standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
             # Rewrite the 4 letter level names for java logs.
             # Ensure we test mount path changes, i.e. that we pick up the config from there/mount it in and can watch for changes.
            simon.murray Simon Murray made changes -
            Rank Ranked higher
            simon.murray Simon Murray made changes -
            Rank Ranked higher
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # -Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.-
             # -Provide full integration tests with CI - already present locally but unable to run until https://issues.couchbase.com/browse/CBD-4018-
             # Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally): coming soon in FB but also see https://github.com/neiman-marcus/fluent-bit-out-prometheus-metrics
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclomatic complexity and refactor watcher to simplify- – standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
             # Rewrite the 4 letter level names for java logs. Good example of sorting case out as well: https://github.com/sassoftware/viya4-monitoring-kubernetes/blob/eaaf0498f835cbabbcf9f55715ddeafae2d68ca5/logging/fb/fluent-bit_config.configmap_open.yaml#L731
             # Ensure we test mount path changes, i.e. that we pick up the config from there/mount it in and can watch for changes.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Link This issue relates to K8S-2147 [ K8S-2147 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # Include cluster name when enriching the log data
             # Documentation on fd limits, etc. for ifsnotify and the like
             # -Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.-
             # -Provide full integration tests with CI-
             # -Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112-
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally): coming soon in FB but also see https://github.com/neiman-marcus/fluent-bit-out-prometheus-metrics
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclomatic complexity and refactor watcher to simplify, standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
             # -Rewrite the 4 letter level names for java logs. Good example of sorting case out as well: [https://github.com/sassoftware/viya4-monitoring-kubernetes/blob/eaaf0498f835cbabbcf9f55715ddeafae2d68ca5/logging/fb/fluent-bit_config.configmap_open.yaml#L731]- 
             # Ensure we test mount path changes, i.e. that we pick up the config from there/mount it in and can watch for changes.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # -Include cluster name when enriching the log data-
             # Documentation on fd limits, etc. for ifsnotify and the like
             # -Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.-
             # -Provide full integration tests with CI-
             # -Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112-
             # Look to see if we can provide counters for various errors and/or prometheus metrics (optionally): coming soon in FB but also see https://github.com/neiman-marcus/fluent-bit-out-prometheus-metrics
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclomatic complexity and refactor watcher to simplify, standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
             # -Rewrite the 4 letter level names for java logs. Good example of sorting case out as well: [https://github.com/sassoftware/viya4-monitoring-kubernetes/blob/eaaf0498f835cbabbcf9f55715ddeafae2d68ca5/logging/fb/fluent-bit_config.configmap_open.yaml#L731]- 
             # Ensure we test mount path changes, i.e. that we pick up the config from there/mount it in and can watch for changes.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # -Include cluster name when enriching the log data-
             # Documentation on fd limits, etc. for ifsnotify and the like
             # -Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.-
             # -Provide full integration tests with CI-
             # -Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112-
             # -Look to see if we can provide counters for various errors and/or prometheus metrics (optionally): coming soon in FB but also see [https://github.com/neiman-marcus/fluent-bit-out-prometheus-metrics]- 
             # -Add Docker-compose stack as an example for local usage.-
             # Use bats-test possibly to provide a framework to run various tests better than the current single script (and output in different formats, etc.)
             # -Reduce cyclomatic complexity and refactor watcher to simplify, standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke
             # -Rewrite the 4 letter level names for java logs. Good example of sorting case out as well: [https://github.com/sassoftware/viya4-monitoring-kubernetes/blob/eaaf0498f835cbabbcf9f55715ddeafae2d68ca5/logging/fb/fluent-bit_config.configmap_open.yaml#L731]- 
             # Ensure we test mount path changes, i.e. that we pick up the config from there/mount it in and can watch for changes.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Component/s logging [ 16330 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Link This issue relates to K8S-2171 [ K8S-2171 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Link This issue relates to K8S-2172 [ K8S-2172 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Assignee Patrick Stephens [ JIRAUSER25332 ] Roo Thorp [ JIRAUSER25108 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Fix Version/s 2.3.0 [ 17600 ]
            Fix Version/s not-targeted [ 16613 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # -Include cluster name when enriching the log data-
             # -Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.-
             # -Provide full integration tests with CI-
             # -Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112-
             # -Look to see if we can provide counters for various errors and/or prometheus metrics (optionally): coming soon in FB but also see [https://github.com/neiman-marcus/fluent-bit-out-prometheus-metrics]- 
             # -Add Docker-compose stack as an example for local usage.-
             # -Reduce cyclomatic complexity and refactor watcher to simplify, standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # -Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: [https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke]- 
             # -Rewrite the 4 letter level names for java logs. Good example of sorting case out as well: [https://github.com/sassoftware/viya4-monitoring-kubernetes/blob/eaaf0498f835cbabbcf9f55715ddeafae2d68ca5/logging/fb/fluent-bit_config.configmap_open.yaml#L731]- 
             # Ensure we test mount path changes, i.e. that we pick up the config from there/mount it in and can watch for changes.
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Link This issue relates to K8S-2324 [ K8S-2324 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Resolution Done [ 6 ]
            Status In Progress [ 3 ] Resolved [ 5 ]
            patrick.stephens Patrick Stephens (Inactive) made changes -
            Description
            Various improvements found that may be useful during testing:
             # -Include cluster name when enriching the log data-
             # -Reduce the _tail_ refresh interval (currently 60 seconds) so it picks up logs sooner - the container starts quickly but if the log directory or rebalance is not present then can take a while once server starts so we may lose logs on a quick failure.-
             # -Provide full integration tests with CI-
             # -Default Loki output - need to make sure no impact on customer usage, ideally a simple method to enable during testing but with a managed config. Relates to K8S-2112-
             # -Look to see if we can provide counters for various errors and/or prometheus metrics (optionally): coming soon in FB but also see [https://github.com/neiman-marcus/fluent-bit-out-prometheus-metrics]- 
             # -Add Docker-compose stack as an example for local usage.-
             # -Reduce cyclomatic complexity and refactor watcher to simplify, standardise logging as per operator too.-
             # -Add unit tests for watcher functionality - all covered by integration tests currently so shift left if possible.-
             # -Document GKE set up - issues with Autopilot and Promtail. Stalling of loki input. Working now it seems without PV: [https://github.com/patrick-stephens/couchbase-gitops/tree/main/gke]- 
             # -Rewrite the 4 letter level names for java logs. Good example of sorting case out as well: [https://github.com/sassoftware/viya4-monitoring-kubernetes/blob/eaaf0498f835cbabbcf9f55715ddeafae2d68ca5/logging/fb/fluent-bit_config.configmap_open.yaml#L731]- 
             # Ensure we test mount path changes, i.e. that we pick up the config from there/mount it in and can watch for changes: K8S-2324
            roo.thorp Roo Thorp made changes -
            Status Resolved [ 5 ] Closed [ 6 ]

            People

              roo.thorp Roo Thorp
              patrick.stephens Patrick Stephens (Inactive)
