Couchbase Server / MB-37983

Basic collection routing framework


Details

    • Type: Task
    • Resolution: Fixed
    • Priority: Major
    • Version/s: 7.0.0, Cheshire-Cat
    • Component/s: XDCR
    • Labels: None

    Description

      Tracks the ability for XDCR to receive collection-based events (i.e. creation/drop) and mutations, and forward them to the corresponding collection-enabled target bucket.

       

      Scenario 1 Happy Path – let’s start out with the happy path so you can follow the collections routing principle.

      Prerequisites:

      1. Both Source and Target are collection aware
      2. One replication created between the source bucket (B1) and the target bucket (B2); both buckets have the same scope and collection layout, so all implicit mappings are valid
      3. Collections Manifest Service is functional
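Prerequisite 2 (identical layouts, all implicit mappings valid) can be sketched as a simple check. This is a minimal Go sketch with a hypothetical `Manifest` shape (scope name to set of collection names); the real collections manifest also carries UID fields and is not this type.

```go
package main

import "fmt"

// Manifest is a simplified collections manifest: scope name -> set of
// collection names. (Hypothetical shape for illustration only.)
type Manifest map[string]map[string]bool

// implicitMappingValid reports whether every source scope/collection has a
// same-named counterpart on the target, i.e. prerequisite 2 holds.
func implicitMappingValid(source, target Manifest) bool {
	for scope, cols := range source {
		tcols, ok := target[scope]
		if !ok {
			return false
		}
		for col := range cols {
			if !tcols[col] {
				return false
			}
		}
	}
	return true
}

func main() {
	src := Manifest{"S1": {"C1": true, "C2": true}}
	tgt := Manifest{"S1": {"C1": true, "C2": true}}
	fmt.Println(implicitMappingValid(src, tgt)) // true: layouts match
	delete(tgt["S1"], "C1")
	fmt.Println(implicitMappingValid(src, tgt)) // false: target "C1" missing
}
```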

       

      Pipeline Creation:

      1. The Pipeline Manager tells the XDCR Factory to create a new pipeline.
      2. The XDCR Factory constructs the routers. Each router is attached directly to one DCP nozzle and knows which XMEM nozzles that DCP nozzle should be mapped to. These are called downstreamParts.
      3. *New* For each XMEM nozzle (downstream part), a “CollectionsRouter” is created in front of it. The CollectionsRouter’s job is to map each incoming DCP packet to the right collection ID on the target (and handle simple errors).
      4. The whole pipeline is created and ready to start.
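The factory step above (one CollectionsRouter wrapped in front of every XMEM downstream part) can be sketched as follows. All type and field names here are illustrative stand-ins, not the actual goxdcr types.

```go
package main

import "fmt"

// XmemNozzle is a stand-in for an outbound XMEM nozzle.
type XmemNozzle struct{ id string }

// CollectionsRouter sits in front of one XMEM nozzle and maps packets
// to target collection IDs.
type CollectionsRouter struct{ downstream *XmemNozzle }

// Router is attached to one DCP nozzle; downstreamParts holds one
// CollectionsRouter per XMEM nozzle the DCP nozzle feeds.
type Router struct {
	dcpNozzle       string
	downstreamParts map[string]*CollectionsRouter
}

// newRouter mirrors the factory step: create a CollectionsRouter in
// front of each XMEM downstream part.
func newRouter(dcp string, xmems []*XmemNozzle) *Router {
	r := &Router{dcpNozzle: dcp, downstreamParts: map[string]*CollectionsRouter{}}
	for _, x := range xmems {
		r.downstreamParts[x.id] = &CollectionsRouter{downstream: x}
	}
	return r
}

func main() {
	xmems := []*XmemNozzle{{id: "xmem_0"}, {id: "xmem_1"}}
	r := newRouter("dcp_0", xmems)
	fmt.Println(len(r.downstreamParts)) // one CollectionsRouter per XMEM
}
```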

       

      Pipeline Start and Data flows:

      1. The pipeline Start()’s, and DCP opens up UprStream with collections enabled.
      2. In this example, say the user creates a collection named C1 with ID 1, then writes a mutation into [CID 1].
      3. Manifest ID 1 is associated with this operation. [MID 1]
         a. Before sending a mutation with a collection ID embedded inside, DCP sends down a system event that says “KV knows about [MID 1] and future mutations are based on this knowledge”.
         b. This system event is sent through all 1024 VBs. Some VBs get it earlier and some later, depending on how fast the DCP nozzle is operating.
      4. The DCP nozzle receives the system event per VB.
         a. On a per-VB basis, it keeps track of “the highest MID received”.
         b. It marks the system-event “mutation” as processed so stats and the through-seqno are correct.
      5. After sending down the system event, KV sends down the mutation the user wrote in step 2.
      6. DCP decodes the mutation and determines its CID is 1.
      7. Based on the “latest manifest” it received (4a), it queries the Collections Manifest Service for that manifest.
      8. It looks up CID 1 and finds that it is associated with the name “C1”.
      9. The router wraps the mutation, routes it, and passes it to the *new* CollectionsRouter sitting in front of each XMEM.
      10. The CollectionsRouter’s RouteCollection() then does the following:
         a. Look up the latest target manifest available to XDCR’s Collections Manifest Service.
         b. Find the CID of the target collection “C1” and write it into the key.
      11. Send it out via XMEM.
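The data-flow steps above (track the highest MID per VB from system events, then translate source CID → collection name → target CID) can be sketched as below. The manifest shapes and function names are assumptions for illustration, not the goxdcr API.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified manifests keyed by manifest ID (MID): the source maps
// collection ID -> name, the target maps name -> collection ID.
var sourceManifests = map[uint64]map[uint32]string{
	1: {1: "C1"}, // MID 1: CID 1 is named "C1"
}
var targetManifests = map[uint64]map[string]uint32{
	1: {"C1": 7}, // target knows "C1" under a different CID, 7
}

// highestMID tracks, per VB, the highest manifest ID seen via DCP
// system events (steps 3-4 above).
var highestMID = map[uint16]uint64{}

func onSystemEvent(vb uint16, mid uint64) {
	if mid > highestMID[vb] {
		highestMID[vb] = mid
	}
}

// routeCollection performs the CID -> name -> target-CID translation of
// steps 6-10 for one mutation.
func routeCollection(vb uint16, srcCID uint32, targetMID uint64) (uint32, error) {
	name, ok := sourceManifests[highestMID[vb]][srcCID]
	if !ok {
		return 0, errors.New("unknown source CID")
	}
	tcid, ok := targetManifests[targetMID][name]
	if !ok {
		return 0, errors.New("no matching target collection")
	}
	return tcid, nil
}

func main() {
	onSystemEvent(0, 1)                   // DCP system event: VB 0 now knows MID 1
	tcid, err := routeCollection(0, 1, 1) // mutation with CID 1 arrives
	fmt.Println(tcid, err)                // 7 <nil>: rewrite the key with target CID 7
}
```

Note the key point of the design: the source and target CIDs for the same collection name are independent, which is why the router must re-map every packet rather than forward the source CID verbatim.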

       

      Potential Error Scenarios

      1. XDCR’s pull of the target manifest lags behind DCP.
      2. Intentionally “broken” implicit mapping, where the user did not create a correspondingly named collection “C1” on the target.
      3. The target node’s ns_server has the manifest but the target node’s KV does not (TODO: MB-38023).


      Current Error Handling Algorithm

      Since XDCR sits on the source side, it cannot distinguish situation 1 from situation 2 above.

      So the error-handling algorithm proposed here tackles both situations the same way; once we can tell 1 from 2, the appropriate recovery action is performed.

       

      Prerequisite:

      1. Given a source collection name, “C1”, pull the *latest* manifest that XDCR knows of from the target (i.e. MID 5).
      2. Each collections router has an internal map called the “broken mapping”. Right now, it is empty.
         a. An entry in the broken mapping says “the mapping from source C1 to target C1 is broken”.
         b. Right now this works for implicit mapping; for explicit mapping it can be enhanced later.
      3. The collections router checks whether “C1” is in the broken mapping. It is not.
      4. It checks whether the target’s MID 5 contains a collection named “C1”. At this time, assume it does not. Thus begins the error-handling procedure.
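The prerequisite checks (steps 3-4: consult the broken mapping first, then look the name up in the latest known target manifest) amount to a small decision function. This sketch uses hypothetical names; the real router state is richer than a string-keyed map.

```go
package main

import "fmt"

// brokenMapping is the collections router's internal map of source
// collection names whose implicit mapping is known-broken.
var brokenMapping = map[string]bool{}

// targetMID5 stands in for the latest target manifest XDCR knows of
// (MID 5); in this walkthrough it does not contain "C1".
var targetMID5 = map[string]bool{"C2": true}

// classify reproduces steps 3-4: known-broken mappings are skipped
// outright; mappable names route normally; the rest enter the
// error-handling procedure.
func classify(srcCol string) (mappable, broken bool) {
	if brokenMapping[srcCol] {
		return false, true // already declared broken: ignore, no retry
	}
	return targetMID5[srcCol], false
}

func main() {
	m, b := classify("C1")
	fmt.Println(m, b) // false false: not broken yet, but unmappable -> error handling
	m, b = classify("C2")
	fmt.Println(m, b) // true false: route normally
}
```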


      Error Handling Procedure:

      1. Each collections router has a “retryQueue”. It is essentially a FIFO buffer with a size limit.
         a. Each collections router launches a “runRetry()” goroutine that periodically tries to re-map the failed mutations against the latest target manifest.
         b. If a mutation fails to be mapped after a certain number of attempts, that mutation’s mapping is considered “broken”.
      2. If the retryQueue buffer is full and a mutation needs to be put into it, the oldest item in the buffer is “bumped out” and force-retried before the periodic goroutine in 1a gets to it.
         a. If the bumped-out mutation fails to be mapped, then, because the retryQueue is full, it is immediately marked “broken”, bypassing 1b.
         b. This handles a burst of traffic coming from DCP and prevents it from overwhelming the retry system.
      3. When the mapping fails in Prereq 4, the mutation is put into the retryQueue.
      4. If the retry succeeds, all is well.
      5. If the retry fails after Max attempts, the mapping is declared broken.
      6. Future mutations with source collection “C1” are automatically ignored, bypassing the retry mechanism altogether.
      7. Note – explicit mapping is more specific and will be addressed later.
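The retryQueue mechanics above (bounded FIFO, periodic retry, bump-out under pressure) can be sketched as follows. The sizes, names, and single-threaded shape are illustrative; the real implementation runs runRetry() as a goroutine with proper locking.

```go
package main

import "fmt"

const (
	queueLimit = 3 // retryQueue size limit (illustrative value)
	maxRetries = 2 // attempts before a mapping is declared broken
)

type pendingMut struct {
	srcCol  string
	retries int
}

var (
	retryQueue []pendingMut        // FIFO buffer of unmappable mutations
	brokenMap  = map[string]bool{} // mappings declared broken
)

// enqueue implements step 2: if the buffer is full, the oldest item is
// bumped out and force-retried once; if that also fails, it is marked
// broken immediately, bypassing the Max-retries path.
func enqueue(m pendingMut, mapsOK func(string) bool) {
	if len(retryQueue) == queueLimit {
		oldest := retryQueue[0]
		retryQueue = retryQueue[1:]
		if !mapsOK(oldest.srcCol) {
			brokenMap[oldest.srcCol] = true
		}
	}
	retryQueue = append(retryQueue, m)
}

// runRetryOnce is one pass of the periodic runRetry() goroutine: retry
// every queued mutation, declaring broken after maxRetries failures.
func runRetryOnce(mapsOK func(string) bool) {
	var remaining []pendingMut
	for _, m := range retryQueue {
		if mapsOK(m.srcCol) {
			continue // mapped successfully, all is well
		}
		m.retries++
		if m.retries >= maxRetries {
			brokenMap[m.srcCol] = true
			continue
		}
		remaining = append(remaining, m)
	}
	retryQueue = remaining
}

func main() {
	never := func(string) bool { return false } // target never gains "C1"
	for i := 0; i < 4; i++ {
		enqueue(pendingMut{srcCol: "C1"}, never) // 4th enqueue bumps the oldest
	}
	fmt.Println(brokenMap["C1"], len(retryQueue)) // true 3
	runRetryOnce(never)
	runRetryOnce(never)
	fmt.Println(len(retryQueue)) // 0: all declared broken after max retries
}
```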

       

      This only tells half the story. The other half is *how* the recovery mechanism works and how the system gets out of this situation.

      There are two recovery mechanisms: Manifest-Service based recovery and I/O based recovery.

       

      Manifest-Service Based Recovery: (NOTE this is not done yet. This is part of the Backfill Manager phase 2)

      1. When there is no I/O going on, manifest-service based recovery takes place.
      2. For example, suppose all mutations have been retried and marked broken based on a target MID of 5, and then the target collection is created, producing MID 6.
         a. The Collections Manifest Service periodically pulls from the target and sees a new MID of 6, which is newer than MID 5.
         b. It finds the diff between MID 5 and MID 6, sees that target “C1” now exists, and creates a backfill request.
         c. This could be racy between this path and the I/O recovery path; it will need more thought when implemented. For example, wait a while for I/O-based recovery to kick in first before actually creating backfill replications.
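Step 2b (diff the old and new target manifests, then raise a backfill request for each broken mapping whose collection now exists) can be sketched as below. The manifest shape and the printed "request" are hypothetical; the real Backfill Manager phase-2 behavior is not implemented yet, as noted above.

```go
package main

import "fmt"

// diffManifests returns the collection names present in the newer
// target manifest but not in the older one.
func diffManifests(older, newer map[string]bool) []string {
	var added []string
	for name := range newer {
		if !older[name] {
			added = append(added, name)
		}
	}
	return added
}

func main() {
	mid5 := map[string]bool{"C2": true}
	mid6 := map[string]bool{"C2": true, "C1": true}
	broken := map[string]bool{"C1": true}
	// Each newly added collection that was marked broken becomes a
	// backfill request (stand-in for the real request object).
	for _, name := range diffManifests(mid5, mid6) {
		if broken[name] {
			fmt.Printf("backfill request for %s\n", name)
		}
	}
}
```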

       

      I/O Based Recovery:

      1. The situation now: a set of mutations was marked broken based on MID 5, because target MID 5 does not contain “C1”. Currently, any mutation from source C1 is ignored.
      2. The XDCR Collections Manifest Service pulls the latest target manifest in the background and gets MID 6.
      3. Right now, the collections routers’ broken maps contain “C1” and nothing else. In this example, assume “C2” already existed in MID 5 and “C2” mutations have been mapping correctly all this time.
      4. When the next “C2” mutation comes through the pipeline, the collections router detects that the latest target manifest is now MID 6, not 5. Even though “C2” is not affected, the router takes this opportunity to check whether the entries in its broken map are still broken.
      5. It does a diff between MID 5 and MID 6.
         a. It sees that “C1” now exists in MID 6, and that “C1” was marked broken.
         b. It declares that “C1” is no longer broken. Future “C1” mutations coming down can now pass through and be mapped.

         c. TODO – need to create a backfill replication for all previously missed C1 mutations. (Part of the Backfill Milestone)
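The I/O-based recovery path above can be sketched as a hook that fires on any passing mutation: when a newer target MID is observed, the router re-checks its broken map against the new manifest. Names and shapes are illustrative, not the actual goxdcr API.

```go
package main

import "fmt"

var (
	brokenMap    = map[string]bool{"C1": true} // "C1" was declared broken on MID 5
	lastKnownMID = uint64(5)
)

// onMutation sketches step 4: when any mutation (here a healthy "C2")
// observes a newer target manifest, the router un-breaks every entry
// whose target collection now exists.
func onMutation(srcCol string, targetMID uint64, manifest map[string]bool) {
	if targetMID > lastKnownMID {
		for name := range brokenMap {
			if manifest[name] {
				delete(brokenMap, name) // e.g. "C1" is no longer broken
			}
		}
		lastKnownMID = targetMID
	}
}

func main() {
	mid6 := map[string]bool{"C1": true, "C2": true}
	onMutation("C2", 6, mid6)    // a healthy C2 mutation triggers the re-check
	fmt.Println(brokenMap["C1"]) // false: future C1 mutations pass through
}
```

This illustrates why the mechanism needs ongoing I/O: with no mutations flowing, the hook never fires, which is exactly the gap the manifest-service based recovery is meant to cover.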

       

      Application:

      Now, knowing the algorithm and recovery path, let’s see how it applies to our original error conditions.

       

      Scenario 1 Not-Happy Path – XDCR has not pulled the latest Target Manifest

      1. If XDCR pulls the latest target manifest during the retry interval (i.e. no broken mapping has been declared), the data is retried and written successfully to the target.
      2. If a broken mapping has been declared, then future “C1” writes will:
         a. depend upon other collections to rescue the broken mapping if I/O is still ongoing, OR
         b. wait for the Collections Manifest Service to pull the latest target manifest that contains “C1” and create backfill requests.
      3. If a gigantic flood of data comes in from DCP and overwhelms the retry buffer:
         a. the retry mechanism won’t run and any mutation with “C1” is immediately declared broken-mapped; it then depends upon 2a/2b to be rescued.

       

      Scenario 2 Intentionally “broken” implicit mapping, where the user did not create a correspondingly named collection “C1” on the target

      1. The retry mechanism or the overflow mechanism will mark the collection “C1” mapping broken, saving wasted resources on future “C1” mutations.
      2. The same recovery applies if the user then creates “C1” on the target.

       

People

    Assignee: Neil Huang
    Reporter: Neil Huang
