Couchbase Elasticsearch Connector / CBES-110

Need document routing to support join

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.1
    • Labels:
      None

      Description

      This version does not support the document routing that joins require. As a workaround, I currently have to use the Elasticsearch API to place the child documents into the index directly, rather than creating Couchbase documents and pushing them through the connector.
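For illustration, the workaround described above (writing child documents straight to Elasticsearch with a routing value) might look roughly like the sketch below. It only builds the newline-delimited body for the Elasticsearch Bulk API and does not send it. The index name, document IDs, the join field name `relation`, and the relation names `parent` and `child` are all hypothetical; the real names depend on the index mapping.

```python
import json


def bulk_lines(index, doc_id, source, routing=None):
    """Return the action line and source line for one bulk 'index' operation."""
    meta = {"_index": index, "_id": doc_id}
    if routing is not None:
        # Children must be routed to the shard that holds their parent.
        meta["routing"] = routing
    return [json.dumps({"index": meta}), json.dumps(source)]


def build_bulk_body(index, parent_id, parent_doc, children):
    """Build an NDJSON bulk body for one parent and its children.

    `children` maps child document ID -> child document body.
    The join field name and relation names are assumptions for this sketch.
    """
    lines = []
    parent_source = dict(parent_doc, relation={"name": "parent"})
    lines += bulk_lines(index, parent_id, parent_source)
    for child_id, child_doc in children.items():
        child_source = dict(
            child_doc, relation={"name": "child", "parent": parent_id}
        )
        lines += bulk_lines(index, child_id, child_source, routing=parent_id)
    # The Bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body(
        "orders",                      # hypothetical index name
        "order::1",                    # hypothetical parent ID
        {"customer": "acme"},
        {"item::1": {"sku": "A"}, "item::2": {"sku": "B"}},
    )
    print(body, end="")
```

The resulting string could be posted to the `_bulk` endpoint by any HTTP or Elasticsearch client. Using the parent ID as the routing value for every child is what keeps a parent and its children on the same shard.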

            Activity

            david.nault David Nault added a comment -

            Because the DCP protocol used by the connector does not include document bodies in deletion notifications, it's not easy to know where to route the deletion requests.

            See discussion over at https://github.com/couchbase/couchbase-elasticsearch-connector/pull/203 where one of the proposals was to support routing for a document type only when the connector is configured to ignore deletions for that type.

            Another caveat is that because each virtual bucket is replicated independently, in general there's no guarantee that two different documents will be indexed in Elasticsearch in the same order they were written to Couchbase (child might be indexed before parent if they are in different vbuckets, for example).
            I haven't experimented with parent/join enough to know whether that's likely to be a problem.

            @envitraux If you could specify a routing but not delete the routed documents, and the parent and children could be indexed in any order, would that satisfy your use case?
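To make the deletion problem concrete, here is a minimal sketch of why a deletion that carries only the document ID can end up on the wrong shard. It assumes the simplified model in which Elasticsearch picks a shard from a hash of the routing value (the document ID when no routing is given) modulo the number of primary shards. `zlib.crc32` is only a stand-in for the hash Elasticsearch actually uses, and the shard count and document IDs are hypothetical.

```python
import zlib

NUM_SHARDS = 5  # hypothetical number of primary shards


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Pick a shard from a routing value (stand-in hash, not Elasticsearch's)."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def index_shard(doc_id, routing=None):
    """Shard an index request lands on: custom routing wins over the ID."""
    return shard_for(routing if routing is not None else doc_id)


def delete_shard_without_body(doc_id):
    """Shard a delete lands on when only the document ID is known.

    A deletion notification without the document body cannot reveal the
    parent ID, so the only routing value available is the ID itself.
    """
    return shard_for(doc_id)


def misrouted_deletes(parent_id, child_ids):
    """Return the child IDs whose delete would miss the shard they live on."""
    return [
        child_id
        for child_id in child_ids
        if index_shard(child_id, routing=parent_id)
        != delete_shard_without_body(child_id)
    ]


if __name__ == "__main__":
    children = [f"child::{i}" for i in range(20)]
    missed = misrouted_deletes("parent::1", children)
    print(f"{len(missed)} of {len(children)} deletes would target the wrong shard")
```

Under this model, a delete only finds its document when the child's own ID happens to hash to the same shard as the parent ID, which is why ignoring deletions for routed types sidesteps the problem.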

            envitraux envitraux added a comment - - edited

            In my use case, the parent is usually in the bucket long before the children. And usually the parent is never deleted. A simple route would do for this.

            As I noted on GitHub: "I would have loved this feature too. I got around the problem by writing the child documents directly to Elasticsearch using their API client. I know the Elasticsearch people are not keen on joins either, but it is hard to denormalize a relationship where there is one parent for thousands of children. On an interesting note, the ES API has some nice features for bulk indexing and deletes."

            I do know the old connector sometimes wrote the child docs before the parent docs, which caused issues. This only came up when I had to rebuild the index from scratch, directly from Couchbase.

            david.nault David Nault added a comment -

            A partial solution (without deletion) will be included in the next release.

            envitraux envitraux added a comment -

            Awesome. I see 4.0.1 is out. I will give it a go.


              People

              • Assignee: David Nault (david.nault)
              • Reporter: envitraux
              • Votes: 0
              • Watchers: 2
