The ES plugin was failing to specify a routing value when deleting child documents, because the routing is not readily available (it is not included in the CAPI request).
The solution has two parts:
- When using RegexParentSelector, the parent ID is already embedded in the child ID. In that case, the routing can be inferred directly from the document ID.
- When using DefaultParentSelector, a shadow document called a "routing signpost" is created for each child document. When deleting a child, the plugin discovers the correct routing by reading the contents of the signpost document.
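The two lookup paths above can be sketched roughly as follows. This is an illustration only, not the plugin's actual code: the `<parentId>::<suffix>` ID pattern, the `signpost::` ID prefix, the `routing` field name, and the `fetch_signpost` callback are all hypothetical names chosen for the example.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern: child IDs are assumed to look like "<parentId>::<suffix>".
PARENT_PATTERN = re.compile(r"^(?P<parent>[^:]+)::.+$")


def routing_from_id(child_id: str) -> Optional[str]:
    """RegexParentSelector case: the parent ID is embedded in the child ID."""
    match = PARENT_PATTERN.match(child_id)
    return match.group("parent") if match else None


def signpost_id(child_id: str) -> str:
    """Hypothetical naming scheme for the shadow (signpost) document."""
    return "signpost::" + child_id


def resolve_routing(
    child_id: str,
    use_regex_selector: bool,
    fetch_signpost: Callable[[str], Optional[dict]],
) -> Optional[str]:
    """Return the routing value for a child deletion, or None if unknown."""
    if use_regex_selector:
        return routing_from_id(child_id)
    # DefaultParentSelector case: read the routing from the signpost document.
    signpost = fetch_signpost(signpost_id(child_id))
    return signpost.get("routing") if signpost else None
```

Here `fetch_signpost` stands in for whatever get-by-ID call retrieves the signpost; in a quick experiment a plain dictionary's `get` method is enough.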
Some caveats apply when using DefaultParentSelector:
- Child documents created by previous versions of the plugin will not be eligible for deletion, since they do not have routing signposts.
- Failure to delete a child because its routing signpost is missing is not treated as a fatal error; a warning is logged and replication continues. The reason is that the signpost and the document it refers to cannot be deleted in a single atomic operation, so the signpost may be deleted successfully while the child deletion fails. When the replication attempt is retried, the signpost no longer exists and the child deletion request can no longer be routed correctly.
- The routing signpost may live on a different Elasticsearch shard than the document it refers to. If the signpost's shard suffers data loss, it may become impossible for the plugin to delete the child document.
- Loading the signpost document will trigger an Elasticsearch index refresh if the child document was both created and deleted within the same index refresh interval. It's fine if this happens occasionally, but constant rapid-fire creation and deletion may cause Elasticsearch performance issues associated with too-frequent refreshing.
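The non-fatal handling of a missing signpost can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the plugin's implementation: the `signpost::` prefix, the `routing` field, the `delete_doc` callback, and in particular the order of the two deletes are assumptions chosen to reproduce the failure mode described above.

```python
import logging
from typing import Callable, Dict

log = logging.getLogger("signpost_sketch")


def delete_child(
    child_id: str,
    signposts: Dict[str, dict],
    delete_doc: Callable[[str, str], None],
) -> bool:
    """Try to delete a child document; return False if it could not be routed."""
    key = "signpost::" + child_id  # hypothetical signpost ID scheme
    signpost = signposts.get(key)
    if signpost is None:
        # Not fatal: log a warning and let replication continue.
        log.warning("No routing signpost for %s; skipping child deletion", child_id)
        return False
    # The two deletes are not atomic. Assumed order: signpost first, then child.
    del signposts[key]
    delete_doc(child_id, signpost["routing"])
    return True
```

If `delete_doc` raises on the first attempt, the signpost is already gone, so a retry takes the warning branch and returns `False` without ever reaching the child document.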
The alternative approach of using a query to locate the child documents (instead of using signpost documents) was considered, but it suffers from issues with document visibility. Documents only appear in query results after an index refresh has made them searchable, and it's not practical to wait, or to force a refresh every time a document is deleted. The signposts, on the other hand, can be fetched by ID with a multi-get request regardless of whether a refresh has happened yet.
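For illustration, a multi-get request body for a batch of signposts might be assembled like this. The `docs` array with `_index` and `_id` entries follows the Elasticsearch multi-get API; the `signpost::` ID prefix and the function name are hypothetical, and older Elasticsearch versions may also expect a `_type` entry per document.

```python
from typing import Dict, Iterable, List


def build_mget_body(index: str, child_ids: Iterable[str]) -> Dict[str, List[dict]]:
    """Build a multi-get body that fetches one signpost per child ID."""
    return {
        "docs": [
            # Hypothetical signpost ID scheme: "signpost::<childId>"
            {"_index": index, "_id": "signpost::" + child_id}
            for child_id in child_ids
        ]
    }
```

The resulting dictionary would be serialized to JSON and sent to the `_mget` endpoint; because each entry is a get-by-ID lookup rather than a search, it does not depend on the documents already being searchable.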