Couchbase Server / MB-26074

DCP may needlessly be telling some clients about deletions.


Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: feature-backlog
    • Affects Version/s: master
    • Component/s: couchbase-bucket
    • Labels: None

    Description

      Consider a vbucket with 0 curr_items, but where in reality n items were stored and then every item was deleted.

      When a client wants to build a copy of the vbucket it will open a DCP stream and ask for everything, that is startseqno:0, endseqno:-1. What will DCP send for this empty vbucket? The answer is n DCP_DELETION messages.
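      To make that concrete, a rough illustration of what the stream looks like for this vbucket (message names are the DCP opcodes, values are hypothetical; endseqno:-1 is simply uint64 max):

          DCP_STREAM_REQ       vb:X start:0 end:0xffffffffffffffff
          DCP_SNAPSHOT_MARKER  start:1 end:<high-seqno>
          DCP_DELETION         key:k1 seqno:...
          DCP_DELETION         key:k2 seqno:...
          ... (n deletions in total, no mutations)
          DCP_STREAM_END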

      This occurs because DCP gives the client the historical data by iterating the by-seqno index, returning all keys in seqno order. We use a couchstore function that optionally allows the by-seqno iteration to include deleted keys; we have that option enabled, hence we return n deleted keys.
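      For reference, a minimal sketch of that by-seqno walk, assuming couchstore's changes_since API (treat the exact signature as an approximation; the callback body is illustrative only):

          #include <libcouchstore/couch_db.h>

          // Called for each entry in the by-seqno index. With the default
          // options (0) this includes deleted documents, which is why a
          // fully-deleted vbucket still produces n callbacks.
          static int bySeqnoCallback(Db* db, DocInfo* docinfo, void* ctx) {
              auto* deletions = static_cast<size_t*>(ctx);
              if (docinfo->deleted) {
                  // Backfill turns this into a DCP_DELETION today.
                  ++(*deletions);
              }
              return 0; // continue iterating
          }

          couchstore_error_t walkBySeqno(Db* db) {
              size_t deletions = 0;
              // options == 0: live and deleted keys, in seqno order.
              return couchstore_changes_since(db, 0 /*since*/, 0 /*options*/,
                                              bySeqnoCallback, &deletions);
          }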

      This actually seems correct, certainly when you consider that DCP was originally created so we could build replica vbuckets for HA of the data. The case I see is that XDCR is allowed to read deleted keys using GetMeta, so the replica needs to store not only 'live' keys but also deletions: even if that vbucket logically has no items, XDCR needs to be able to fail over to a promoted replica and see the same deleted keys (tombstones).

      But what about all the other users of DCP we now have? For the scenario I describe, it seems very unlikely that an FTS client wants to know about deletes for keys it was never told about in the first place.

      For this MB we should consider whether it's possible to add a client option to DCP so a client can indicate how it wants to receive backfill deletions. Thus a pure replica can get all of the deletions (just like today) whilst a different client backfills only 'live' documents, i.e. documents with bodies (system xattrs muddy the water a little, but we already have client opt-in support for xattr values on deleted keys).

      • Q: I wonder if this becomes as simple as mapping the client's option onto the couchstore flag which selects between returning deleted keys and skipping them? (See the sketch below.)
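      As a sketch of that mapping (the client-facing option is hypothetical; COUCHSTORE_NO_DELETES is assumed to be the couchstore option that skips tombstones in the by-seqno walk):

          #include <libcouchstore/couch_db.h>

          // Hypothetical per-stream choice a DCP client could make at
          // stream-open time; the name is illustrative only.
          enum class BackfillDeletes { Include, Skip };

          // Map the client's choice onto the couchstore by-seqno options:
          // 0 returns live + deleted keys, COUCHSTORE_NO_DELETES (assumed)
          // skips the deleted ones entirely.
          couchstore_docinfos_options backfillOptions(BackfillDeletes mode) {
              return mode == BackfillDeletes::Skip ? COUCHSTORE_NO_DELETES : 0;
          }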

      This MB also relates to Collections (MB-16181) and was created after experimenting with collection deletion and DCP.

      With collections it is a requirement that when a collection is deleted, all items of the collection are automatically deleted by the system. Secondly, it is desired (not a requirement) that DCP replicates a collection delete by only signalling that the collection was deleted (with its name), and that the individual item deletions do not generate DCP_DELETION messages; we believe a client can perform the deletions itself from just the initial collection-deleted message.
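      A minimal sketch of the client side of that desire (types and names are hypothetical; the wire format of the collection-deleted message is out of scope here):

          #include <string>
          #include <unordered_map>

          // Hypothetical client-side store, keyed by collection then by key.
          using ClientStore = std::unordered_map<
                  std::string /*collection*/,
                  std::unordered_map<std::string /*key*/, std::string /*value*/>>;

          // On receiving the single collection-deleted message the client can
          // drop every key of that collection itself, so the server never has
          // to stream the individual DCP_DELETIONs.
          void onCollectionDeleted(ClientStore& store, const std::string& collection) {
              store.erase(collection);
          }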

      However, the above assumptions muddy this desire somewhat in a few ways, which are brain-dumped here.

      1. In the MB's initial scenario, the replica-building example shows n DCP_DELETIONs being sent to the replica, which is exactly what the collections desire wants to avoid. Given the tombstone fail-over assumption, any KV-engine replica needs to have all of the deletes of the active, and I would believe the deletes must be at the same seqnos - basically, if the active and replica vbuckets independently search for victim keys and generate deletes, it's hard to guarantee those deletes will be identical (i.e. key1 is deleted with the same seqno on active and replica). Can this assumption be broken?
      2. When a collection is deleted, we asynchronously delete its items, so items of the collection exist in the hash-table/on disk that are logically deleted and are awaiting physical deletion. As part of the desire to hide collection deletions from DCP, we want to treat logically deleted keys the same as real deleted keys: if backfill finds a collection key and that key is logically deleted, it should not be replicated (see the sketch below). However, I can't help but wonder whether skipping telling a real replica about a real key, even though it is pending deletion, creates a tricky corner case with failover.
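      Pulling points 1 and 2 together, a sketch of the kind of per-item backfill decision this implies (all names are hypothetical; how KV-engine actually resolves a key's collection state is not defined here):

          // Hypothetical view of an item produced by backfill.
          struct BackfillItem {
              bool deleted;            // a real tombstone on disk
              bool collectionDropped;  // its collection is logically deleted
          };

          // Hypothetical stream-level option from the proposal above.
          struct StreamOptions {
              bool pureReplica; // true: receive everything, exactly as stored
          };

          // Should backfill send this item to the client?
          bool shouldSendDuringBackfill(const BackfillItem& item,
                                        const StreamOptions& opts) {
              if (opts.pureReplica) {
                  // A real replica must see every key, including tombstones and
                  // keys awaiting physical deletion, so that active and replica
                  // stay identical seqno-for-seqno.
                  return true;
              }
              // 'Less data' clients skip tombstones and skip keys whose
              // collection has been logically deleted; they rely on the single
              // collection-deleted message instead.
              return !item.deleted && !item.collectionDropped;
          }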

      In summary, for the collections aspect of this MB I'm seeing that the proposed client 'pure replica vs non-pure' option affects collections as follows.

      1. Real replicas have nothing hidden from them (they don't turn on the new option): if you delete a massive collection we replicate all the deletes to the replica, so the replica is identical; even logically deleted keys get sent as DCP_MUTATIONs.
      2. For clients who don't need that 100%-pure-replica aspect and just need the live data in seqno order, collections could work as follows.
        1. These clients don't get told about deletions when backfilling
        2. These clients don't get told about logically deleted items when backfilling

      Basically, clients opting in to 'less data' only get told that a collection has been deleted and receive none of the individual deletions, whether backfilling or in-memory streaming.

      Attachments

        Issue Links


          Activity

            People

              Assignee: Jim Walker (jwalker)
              Reporter: Jim Walker (jwalker)
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated:
