  1. Couchbase Server
  2. MB-55352

XDCR - backfill replication unable to commit

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • 7.1.0, 7.1.1, 7.1.2, 7.1.3
    • XDCR
    • None
    • Untriaged
    • 0
    • Unknown

    Description

      (Scope and collection names have been redacted)

      ongoingReplMap=map[08583e4321465ae56cf1ad396261d002/onebridge/onebridge:true backfill_08583e4321465ae56cf1ad396261d002/onebridge/onebridge:true c57dcec1a8d912942a529639cbb2cd9d/onebridge/onebridge:true]
      2023-01-19T16:49:05.325Z ERRO GOXDCR.BackfillMgr: c57dcec1a8d912942a529639cbb2cd9d/onebridge/onebridge experienced error when persisting (type 1) - requested resource not found
      2023-01-19T16:49:05.325Z ERRO GOXDCR.BackfillMgr: Retrying job {map[] map[...| ] 74} failed due to requested resource not found - will try again next cycle
      

      There seems to be a race condition. The backfill spec was expected to already exist so that more tasks could be appended to it (hence the retry message above, and why the operation is type 1, a "Set" rather than an "Add" on a metakv key).
      However, if the spec is deleted in the meantime, the append fails because there is no longer anything to append to.

      The presence of these errors means that the backfill probably did not complete successfully, so mutations are likely missing. If any node is restarted at this point, the missing backfill will never be recovered.
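
      To check whether a cluster is showing this symptom, the log can be scanned for the persistence error. The following is a minimal Go sketch; the regular expression is derived only from the two log lines quoted above and may not match other versions' log formats, and `affectedReplications` is a hypothetical helper name.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// persistErrRe matches the BackfillMgr persistence error quoted in this
// ticket and captures the replication ID that precedes it.
var persistErrRe = regexp.MustCompile(
	`ERRO GOXDCR\.BackfillMgr: (\S+) experienced error when persisting \(type \d+\) - requested resource not found`)

// affectedReplications returns each replication ID that logged the error,
// in order of first appearance and without duplicates.
func affectedReplications(log string) []string {
	seen := map[string]bool{}
	var ids []string
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		m := persistErrRe.FindStringSubmatch(sc.Text())
		if m != nil && !seen[m[1]] {
			seen[m[1]] = true
			ids = append(ids, m[1])
		}
	}
	return ids
}

func main() {
	// Sample lines copied from the description of this ticket.
	sample := `2023-01-19T16:49:05.325Z ERRO GOXDCR.BackfillMgr: c57dcec1a8d912942a529639cbb2cd9d/onebridge/onebridge experienced error when persisting (type 1) - requested resource not found
2023-01-19T16:49:05.325Z ERRO GOXDCR.BackfillMgr: Retrying job {map[] map[...| ] 74} failed due to requested resource not found - will try again next cycle`
	for _, id := range affectedReplications(sample) {
		fmt.Println(id)
	}
}
```

      Note that `bufio.Scanner` rejects lines longer than its default buffer (64 KiB), so a real log with very long lines would need `Scanner.Buffer` to be raised.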

      More investigation is needed to find out why and how the system can get into this state, and how to recover.

            People

              neil.huang Neil Huang