  1. Couchbase Server
  2. MB-33355

[CX] GC overhead limit exceeded on CC after 42.5 hours of constant node rejoins

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • 6.5.0
    • 6.5.0
    • analytics
    • Untriaged
    • Unknown
    • CX Sprint 144

    Description

      As observed in the MB-32921 logs, the NC is unable to decrypt its key because of the openssl environment issue. As a result, the NC crashes while attempting to join the cluster and enters a restart loop. After 42+ hours and 12K+ repeated rejoins, the CC exits with OOM: GC overhead limit exceeded.

      Each iteration of the registration loop looks like this on the CC:

      2019-02-03T01:32:39.545-08:00 WARN CBAS.cluster.NodeManager [Worker:ClusterController] +addNode: 0d54f53fda3750763cc6e6518cf63ece
      2019-02-03T01:32:39.545-08:00 WARN CBAS.cluster.NodeManager [Worker:ClusterController] Node '0d54f53fda3750763cc6e6518cf63ece' is already registered; failing the node then re-registering.
      2019-02-03T01:32:39.545-08:00 INFO CBAS.cluster.NodeManager [Worker:ClusterController] 0d54f53fda3750763cc6e6518cf63ece considered dead
      2019-02-03T01:32:39.545-08:00 INFO CBAS.bootstrap.ClusterLifecycleListener [Worker:ClusterController] NC: 0d54f53fda3750763cc6e6518cf63ece left
      2019-02-03T01:32:39.545-08:00 INFO CBAS.utils.ClusterStateManager [Worker:ClusterController] Removing configuration parameters for node id 0d54f53fda3750763cc6e6518cf63ece
      2019-02-03T01:32:39.545-08:00 INFO CBAS.utils.ClusterStateManager [Worker:ClusterController] ignoring update to same cluster state of ACTIVE
      2019-02-03T01:32:39.545-08:00 INFO CBAS.cluster.NodeManager [Worker:ClusterController] adding node to registry
      2019-02-03T01:32:39.545-08:00 INFO CBAS.cluster.NodeManager [Worker:ClusterController] updating cluster capacity
      2019-02-03T01:32:39.545-08:00 INFO CBAS.work.RegisterNodeWork [Worker:ClusterController] registered node: 0d54f53fda3750763cc6e6518cf63ece
      2019-02-03T01:32:39.545-08:00 INFO CBAS.bootstrap.ClusterLifecycleListener [Worker:ClusterController] NC: 0d54f53fda3750763cc6e6518cf63ece joined
      2019-02-03T01:32:39.545-08:00 INFO CBAS.utils.ClusterStateManager [Worker:ClusterController] Registering configuration parameters for node id 0d54f53fda3750763cc6e6518cf63ece
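
To confirm the rejoin count from a CC log, the "already registered" warning in the excerpt above can be tallied per node id. The following is a minimal sketch that assumes every rejoin emits that warning in exactly the format shown; the function name `count_rejoins` and the regular expression are illustrative and not part of Couchbase Server.

```python
import re
from collections import Counter

# Matches the NodeManager warning from the log excerpt, e.g.
#   <timestamp> WARN CBAS.cluster.NodeManager [Worker:ClusterController]
#   Node '<id>' is already registered; failing the node then re-registering.
# Assumption: the line format is exactly as shown in the excerpt.
REJOIN_RE = re.compile(
    r"^\s*(?P<ts>\S+)\s+WARN\s+CBAS\.cluster\.NodeManager\s+\[[^\]]*\]\s+"
    r"Node '(?P<node>[0-9a-f]+)' is already registered"
)


def count_rejoins(lines):
    """Return a Counter mapping node id to the number of re-registrations."""
    counts = Counter()
    for line in lines:
        match = REJOIN_RE.match(line)
        if match:
            counts[match.group("node")] += 1
    return counts
```

`count_rejoins` accepts any iterable of lines, so an open file handle for the CC log can be passed directly. A count in the thousands for a single node id would be consistent with the 12K+ rejoins reported here.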
      

            People

              murtadha.hubail Murtadha Hubail
              michael.blow Michael Blow