Couchbase Server / MB-61951

[BP 7.6.2] XDCR - RemClusterSvc loop gets stuck due to stale entry from httpsHostName

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: 7.6.2
    • Affects Versions: 7.6.0, Morpheus, 7.1.4, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.1.3, 7.2.1, 7.1.5, 7.2.4, 7.2.2, 7.1.6, 7.2.3, 7.2.5, 7.6.2, 7.6.1
    • Component: XDCR
    • Triage: Untriaged
    • 0
    • Unknown

    Description

      When none of the nodes in the target node list is reachable, as in the errors below:

      2024-05-17T14:36:38.595+01:00 ERRO GOXDCR.RemClusterSvc: Failed to refresh remote cluster reference remoteCluster/nZiKS9FBRB-43UdxHgmPUFTdvvWAiVgvYx7jj0BkQl0= since none of the nodes in target node list is accessible. node list = [[cb.l0payzmthkuawjsw.aws-guardians.nonprod-project-avengers.com:8091 abcd:18091]]
      2024-05-17T14:36:38.595+01:00 WARN GOXDCR.RemClusterSvc: Agent  periodic refresher encountered error while doing a refresh: Refresh operation could not contact any node in the node list

      the refresh loop gets stuck in the following cycle:
      ErrorRefreshUnreachable -> nodelist and activeHostNames are emptied -> new iteration of the loop -> refresh context init -> cachedRefNodesList is populated from Hostname() and HttpsHostName()

      The httpsHostName used is the one recorded when the remote cluster reference was created, and it is never updated afterwards. As a result, cachedRefNodesList ends up being repopulated with the stale entry on every iteration.
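
      The cycle described above can be illustrated with a small, self-contained Go simulation. This is only a sketch of the reported behaviour, not goxdcr's actual code: the type remoteRef, the functions seedNodeList, refreshOnce and runRefreshLoop, and the host names are all hypothetical and chosen for illustration. It assumes that an empty cached list is always re-seeded from the two host names stored on the reference, which is the step the report identifies as the source of the stale entry.

```go
package main

import "fmt"

// remoteRef is a hypothetical stand-in for a remote cluster reference.
// Both host names are captured when the reference is created.
type remoteRef struct {
	hostName      string
	httpsHostName string
}

// seedNodeList models the "refresh context init" step: when the cached
// list is empty, it is rebuilt from the host names stored on the reference.
func seedNodeList(ref remoteRef, cached []string) []string {
	if len(cached) > 0 {
		return cached
	}
	return []string{ref.hostName, ref.httpsHostName}
}

// refreshOnce tries every cached node. If none is reachable it returns an
// empty list, modelling the ErrorRefreshUnreachable path.
func refreshOnce(cached []string, reachable map[string]bool) ([]string, bool) {
	for _, node := range cached {
		if reachable[node] {
			return cached, true
		}
	}
	return nil, false
}

// runRefreshLoop runs a bounded number of iterations and records the node
// list that each iteration started with.
func runRefreshLoop(ref remoteRef, reachable map[string]bool, iterations int) (history [][]string, ok bool) {
	var cached []string
	for i := 0; i < iterations; i++ {
		cached = seedNodeList(ref, cached)
		history = append(history, append([]string(nil), cached...))
		cached, ok = refreshOnce(cached, reachable)
		if ok {
			return history, true
		}
	}
	return history, false
}

func main() {
	// Hypothetical addresses: the stored names no longer match the target.
	ref := remoteRef{hostName: "old-host:8091", httpsHostName: "stale-host:18091"}
	reachable := map[string]bool{"new-host:18091": true}

	history, ok := runRefreshLoop(ref, reachable, 3)
	fmt.Println("refreshed:", ok)
	for i, nodes := range history {
		fmt.Printf("iteration %d tried %v\n", i+1, nodes)
	}
}
```

      In this model every iteration starts from the same two stored names, so the loop cannot make progress unless the stored httpsHostName is refreshed or the seeding step draws on another source.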

            People

              ayush.nayyar Ayush Nayyar
              sudeep.jathar Sudeep Jathar