  1. Couchbase Server
  2. MB-53158

Docs are not replicating between mixed mode XDCR after one DC upgrade from 6.5.1 to 7.0.3


Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 7.0.3
    • Morpheus
    • XDCR
    • CBS-7.0.3-15230
      CBS 6.5.1-6299
    • Untriaged
    • Centos 64-bit
    • 1
    • No

    Attachments

      1. Doc_content_on_DC_1.png (510 kB)
      2. Doc_content_on_DC_2.png (607 kB)
      3. Doc_metadata_on_DC_1.png (362 kB)
      4. Doc_metadata_on_DC_2.png (534 kB)
      5. doc revid on 6.5.1 cluster.png (1.48 MB)
      6. doc revid on upgraded cluster.png (1.24 MB)
      7. Screen Shot 2022-07-28 at 3.41.01 PM.png (93 kB)

      Activity

        lilei.chen Lilei Chen added a comment - - edited

        Hemant Rajput Are the clusters still there? I connected to http://172.23.104.197:8091 and http://172.23.105.15:8091 to take a look. It seems the replications are up to date, and the logs from the data service nodes confirm this. I tried to access the document, and both clusters return the same result.

        lilei.chen Lilei Chen added a comment -

        Hemant Rajput Please assign it back to me if you can bring the clusters to the failure state.


        hemant.rajput Hemant Rajput added a comment -

        Lilei Chen, the clusters are up, and I see that the content is not synced.

        DataCentre 1 - http://172.23.104.197:8091

        DataCentre 2 - http://172.23.105.15:8091

        As you can see, the rev for doc "A_test_doc_0" is 44-xxx on DC1 and 53-xxx on DC2. The content is also different.
        lilei.chen Lilei Chen added a comment -

        Hemant Rajput 

        I noticed that in metadata output, for data center 1, you have id="A_test_doc_0", for data center 2, you have id="\u0000A_test_doc_0" (with leading \u0000).

        I was using Google Chrome and the only way for me to access the document is to search by name, and the output id is always "A_test_doc_0" and they have the same content and revId (44) for both data centers.

        I switched to Firefox. For data center 1, the documents don't show up by default, so I searched for the document by name and got revId 44. For data center 2, the list of documents shows up by default. If I click on "A_test_doc_0", I see the id as "\u0000A_test_doc_0" and revId=55. If I search by name to retrieve the document, I get revId=44 and an id without the leading "\u0000".

        I am not sure what's going on in KV or the UI with the leading \u0000. I ran the xdcrDiffer tool, which gets documents through DCP from both data centers, deduplicates and compares them, and it found 0 differences. So replication is correct here.
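        The dedup-and-compare step described above can be pictured with a small sketch. This is not xdcrDiffer's actual implementation: the `Mutation` record, the `dedup` and `diff` helpers, and the single global `seqno` ordering are all assumptions made for illustration (real DCP sequence numbers are per vBucket).

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Mutation(NamedTuple):
    """Hypothetical record for one mutation read from a change stream."""
    key: bytes    # document ID as raw bytes, so a leading NUL is preserved
    seqno: int    # ordering within the stream (simplified to one global counter)
    rev: int      # revision number, e.g. 44
    body: bytes   # document content


def dedup(stream: Iterable[Mutation]) -> Dict[bytes, Mutation]:
    """Keep only the latest mutation seen for each key."""
    latest: Dict[bytes, Mutation] = {}
    for m in stream:
        current = latest.get(m.key)
        if current is None or m.seqno > current.seqno:
            latest[m.key] = m
    return latest


def diff(source: Iterable[Mutation],
         target: Iterable[Mutation]) -> Tuple[List[bytes], List[bytes], List[bytes]]:
    """Return (missing_on_target, missing_on_source, mismatched) key lists."""
    src, tgt = dedup(source), dedup(target)
    missing_on_target = sorted(src.keys() - tgt.keys())
    missing_on_source = sorted(tgt.keys() - src.keys())
    mismatched = sorted(
        k for k in src.keys() & tgt.keys()
        if (src[k].rev, src[k].body) != (tgt[k].rev, tgt[k].body)
    )
    return missing_on_target, missing_on_source, mismatched
```

        Because keys are compared as raw bytes, a null-prefixed ID and its unprefixed counterpart are tracked as two separate entries, so a report of zero differences covers both.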


        hemant.rajput Hemant Rajput added a comment -

        The leading \u0000 is there because we added the docs with a null prefix to reproduce the UPS issue - https://issues.couchbase.com/browse/CBSE-12281

        If the content for doc "A_test_doc_0" is the same in both DCs, then why don't we have the doc with the same name with the null prefix in DC1?
        lilei.chen Lilei Chen added a comment -

        If the content for doc "A_test_doc_0" is same in both DC then why don't we have the doc with the same name with null prefix in DC1?

        We know A_test_doc_0 is the same on both DCs, with revId 44. "\u0000A_test_doc_0" is a different document.

        For DC1, we can't find the null-prefixed document through the UI. For DC2, I tried and couldn't find it by searching by doc ID in the UI or through REST, even though it appears in the document list if you happen to use Firefox. There must be something about null-prefixed documents that the UI/REST layer doesn't handle well.

        As far as XDCR is concerned, null-prefixed documents are handled the same as any other documents. We don't modify the doc ID. xdcrDiffer reports that there is no diff between the buckets.
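        To make the distinction between the two IDs concrete, here is a minimal sketch using only the Python standard library. The `show` helper is a hypothetical name introduced for illustration; it relies on JSON escaping to make the otherwise invisible NUL character visible, which matches the "\u0000" notation seen in the metadata output. Whether the UI or REST layer accepts the percent-encoded form is not established in this thread.

```python
import json
from urllib.parse import quote

plain = "A_test_doc_0"
prefixed = "\u0000A_test_doc_0"  # same name with a leading NUL character


def show(doc_id: str) -> str:
    """Render a document ID with control characters escaped (hypothetical helper)."""
    return json.dumps(doc_id)


# The two strings look alike when printed raw but are different keys.
print(plain == prefixed)                 # False
print(show(plain))                       # "A_test_doc_0"
print(show(prefixed))                    # "\u0000A_test_doc_0"

# The prefixed ID is exactly one byte longer in UTF-8.
print(len(plain.encode("utf-8")), len(prefixed.encode("utf-8")))  # 12 13

# Percent-encoded form of the prefixed ID, as it would appear in a URL path.
print(quote(prefixed, safe=""))          # %00A_test_doc_0
```

        A search box or URL handler that drops or truncates at the NUL would resolve the prefixed name to the plain document, which would be consistent with the lookups above returning revId 44.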


        People

          hemant.rajput Hemant Rajput
          hemant.rajput Hemant Rajput