When XDCR is stopped and then cold started, it reloads all of its metadata from metakv, such as replication specs. As of 7.0, it also loads backfill replication specs.
Backfill replication specs and compressed backfill mapping files all live under the same "directory" in metakv. The current logic tries to load each spec only once by checking a specific prefix. If the prefix has already been processed, it skips to the next "backfill spec".
There is a mistake in the reloading logic: the backfill replication service doesn't check whether the loaded metakv doc is a backfill mapping doc rather than a backfill replication spec doc. As a result, the reloading mechanism may try to unmarshal a backfill mapping doc into a backfill replication spec. The unmarshal does not return an error, so the failure is silent. The "reloaded" backfill spec is then empty and no backfill tasks are launched.
This can lead to data loss if goxdcr happens to be bounced or restarted, because the backfill that was pending is never resumed.
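
The silent failure described above can be reproduced outside of goxdcr. The following is a minimal sketch, not the actual goxdcr code or the merged fix: the BackfillSpec type, the key names, the mappingSuffix constant and the loadSpec helper are all hypothetical, and it assumes both doc types are stored as JSON objects. It shows that Go's encoding/json ignores unknown fields, so decoding a mapping doc into a spec struct succeeds and leaves the struct empty, and how a loader could reject such a doc instead.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// BackfillSpec is a hypothetical, simplified stand-in for a backfill
// replication spec; the real goxdcr type has different fields.
type BackfillSpec struct {
	Id    string   `json:"id"`
	Tasks []string `json:"tasks"`
}

// mappingSuffix is an assumed key suffix that marks a mapping doc.
const mappingSuffix = "/backfillMappings"

// loadSpec decodes a metakv value into a spec. It rejects keys that look
// like mapping docs, and values that decode into an empty spec.
func loadSpec(key string, value []byte) (*BackfillSpec, error) {
	if strings.HasSuffix(key, mappingSuffix) {
		return nil, fmt.Errorf("%s is a mapping doc, not a spec", key)
	}
	spec := &BackfillSpec{}
	if err := json.Unmarshal(value, spec); err != nil {
		return nil, err
	}
	if spec.Id == "" {
		return nil, fmt.Errorf("%s decoded into an empty spec", key)
	}
	return spec, nil
}

func main() {
	// A made-up mapping doc: valid JSON, but none of the spec's fields.
	mappingDoc := []byte(`{"nsMappingRecords":["AAAA"]}`)

	// Unguarded decode: no error, and the spec is left empty.
	var spec BackfillSpec
	err := json.Unmarshal(mappingDoc, &spec)
	fmt.Println(err == nil, spec.Id == "", len(spec.Tasks)) // prints: true true 0

	// Guarded decode: the mapping doc is rejected with an error.
	_, err = loadSpec("backfill/repl1/backfillMappings", mappingDoc)
	fmt.Println("guarded load:", err)
}
```

The check on the decoded spec is a second line of defence in this sketch: even if a mapping doc is stored under an unexpected key, an empty result is reported as an error rather than being treated as a spec with no tasks.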
|For Gerrit Dashboard: MB-47778|
|158954,2||MB-47778 - backfill spec service may not load backfill spec correctly when cold starting||cheshire-cat||goxdcr||Status: MERGED||+2||+1|