Couchbase Server / MB-13797

GoXDCR: Replication stuck, updates/deletes not replicated (after topology change)

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • Fix Version/s: 4.0.0
    • Affects Version/s: 4.0.0
    • Component/s: XDCR
    • Security Level: Public
    • CentOS

    Description

      Build
      -------
      3.5.0-1415

      Testcase

      ./testrunner -i INI_FILE.ini get-cbcollect-info=True,get-logs=False,stop-on-failure=False,replication_type=xmem,enable_goxdcr=True,demand_encryption=1,GROUP=ALL -t xdcr.rebalanceXDCR.Rebalance.async_rebalance_in,items=100000,rdirection=unidirection,ctopology=chain,update=C1,delete=C1,rebalance=C1-C2,num_rebalance=1,GROUP=P1

      1. Set up encrypted XDCR from C1 [.45, .46] to C2 [.47, .48] on the default bucket.
      2. Add one node to both C1 and C2, then rebalance.
      3. Perform updates and deletes on C1.

      On C2, the active item count does not reach the expected value:

      [2015-03-05 13:20:53,220] - [task:487] WARNING - Not Ready: vb_active_curr_items 79993 == 70000 expected on '172.23.106.47:8091''172.23.106.48:8091''172.23.106.209:8091', default bucket
      [2015-03-05 13:20:58,294] - [task:487] WARNING - Not Ready: vb_active_curr_items 79993 == 70000 expected on '172.23.106.47:8091''172.23.106.48:8091''172.23.106.209:8091', default bucket
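To quantify the gap in the warnings above, here is a minimal sketch that pulls the actual and expected item counts out of a "Not Ready" line. The regular expression and the helper name `item_gap` are illustrative assumptions based only on the log format shown here; they are not part of testrunner.

```python
import re

# Log line copied from the report above.
LINE = (
    "[2015-03-05 13:20:53,220] - [task:487] WARNING - Not Ready: "
    "vb_active_curr_items 79993 == 70000 expected on "
    "'172.23.106.47:8091''172.23.106.48:8091''172.23.106.209:8091', default bucket"
)

# Assumed shape of the message: "<stat> <actual> == <expected> expected".
PATTERN = re.compile(
    r"Not Ready: (?P<stat>\w+) (?P<actual>\d+) == (?P<expected>\d+) expected"
)


def item_gap(line):
    """Return (stat, actual, expected, actual - expected), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    actual = int(match["actual"])
    expected = int(match["expected"])
    return match["stat"], actual, expected, actual - expected


print(item_gap(LINE))
```

For the line above, the difference is 9993 items. If the expected count already accounts for the deletes performed on C1, that is roughly the number of deletes that had not reached C2 when the check ran.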

      Updates are also missing on C2:

      [2015-03-05 13:21:10,596] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-26412 =====
      [2015-03-05 13:21:10,596] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:1
      [2015-03-05 13:21:10,597] - [task:1362] ERROR - cas mismatch: Source cas:1425589721679396864, Destination cas:1425589162784980992, Error Count:2
      [2015-03-05 13:21:10,597] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589721679396864, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,598] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589162784980992, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,608] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-23286 =====
      [2015-03-05 13:21:10,608] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:3
      [2015-03-05 13:21:10,609] - [task:1362] ERROR - cas mismatch: Source cas:1425589715759071232, Destination cas:1425589161183543296, Error Count:4
      [2015-03-05 13:21:10,610] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589715759071232, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,610] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589161183543296, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,620] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-20484 =====
      [2015-03-05 13:21:10,620] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:5
      [2015-03-05 13:21:10,620] - [task:1362] ERROR - cas mismatch: Source cas:1425589710890663936, Destination cas:1425589159591018496, Error Count:6
      [2015-03-05 13:21:10,621] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589710890663936, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,622] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589159591018496, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,626] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-28192 =====
      [2015-03-05 13:21:10,627] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:7
      [2015-03-05 13:21:10,627] - [task:1362] ERROR - cas mismatch: Source cas:1425589725057318912, Destination cas:1425589163947261952, Error Count:8
      [2015-03-05 13:21:10,628] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589725057318912, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,628] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589163947261952, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,637] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-28106 =====
      [2015-03-05 13:21:10,637] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:9
      [2015-03-05 13:21:10,638] - [task:1362] ERROR - cas mismatch: Source cas:1425589724893216768, Destination cas:1425589163870060544, Error Count:10
      [2015-03-05 13:21:10,639] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589724893216768, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,640] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589163870060544, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,649] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-5329 =====
      [2015-03-05 13:21:10,650] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:11
      [2015-03-05 13:21:10,650] - [task:1362] ERROR - cas mismatch: Source cas:1425589684802355200, Destination cas:1425589151786205184, Error Count:12
      [2015-03-05 13:21:10,651] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589684802355200, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,652] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589151786205184, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,656] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-24379 =====
      [2015-03-05 13:21:10,656] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:13
      [2015-03-05 13:21:10,657] - [task:1362] ERROR - cas mismatch: Source cas:1425589717781839872, Destination cas:1425589161780248576, Error Count:14
      [2015-03-05 13:21:10,657] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589717781839872, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,658] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589161780248576, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,664] - [task:1361] ERROR - ===== Verifying rev_ids failed for key: C1-key-16129 =====
      [2015-03-05 13:21:10,664] - [task:1362] ERROR - seqno mismatch: Source seqno:2, Destination seqno:1, Error Count:15
      [2015-03-05 13:21:10,665] - [task:1362] ERROR - cas mismatch: Source cas:1425589703041155072, Destination cas:1425589157438291968, Error Count:16
      [2015-03-05 13:21:10,665] - [task:1363] ERROR - Source meta data:

      {'deleted': 0, 'seqno': 2, 'cas': 1425589703041155072, 'flags': 0, 'expiration': 0}

      [2015-03-05 13:21:10,666] - [task:1364] ERROR - Dest meta data:

      {'deleted': 0, 'seqno': 1, 'cas': 1425589157438291968, 'flags': 0, 'expiration': 0}
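The mismatches above all follow one pattern: the source is at seqno 2 and the destination is still at seqno 1. The sketch below diffs one source/destination pair and, as a rough aid for lining the mutation up with the replication errors, converts the CAS to a wall-clock time. Treating the CAS as nanoseconds since the Unix epoch is an assumption; the values in this report are consistent with it, but the report itself does not state it. The helper names `diff_meta` and `cas_to_utc` are hypothetical.

```python
from datetime import datetime, timezone

# Metadata for key C1-key-26412, copied from the report above.
source = {'deleted': 0, 'seqno': 2, 'cas': 1425589721679396864, 'flags': 0, 'expiration': 0}
dest = {'deleted': 0, 'seqno': 1, 'cas': 1425589162784980992, 'flags': 0, 'expiration': 0}


def diff_meta(src, dst):
    """Return {field: (source_value, dest_value)} for every field that differs."""
    return {k: (src[k], dst.get(k)) for k in src if src[k] != dst.get(k)}


def cas_to_utc(cas):
    """ASSUMPTION: interpret the CAS as nanoseconds since the Unix epoch."""
    return datetime.fromtimestamp(cas // 10**9, tz=timezone.utc)


print(diff_meta(source, dest))
print("source:", cas_to_utc(source['cas']).isoformat())
print("dest:  ", cas_to_utc(dest['cas']).isoformat())
```

Under that assumption, the source mutation for this key dates to 21:08:41 UTC (13:08:41 at -08:00), while the destination copy dates to 20:59:22 UTC, which would place the update on C1 inside the window in which the pipeline on .45 was reporting errors.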

      On .45, I see:

      PipelineManager13:29:04.769265 [INFO] Replication Status = map[replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default:name={replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default}, status={Replicating}, errors={[
      {"time":"2015-03-05T13:10:05.139728617-08:00","errMsg":"map[xmem_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.47:11210_1:Xmem is stuck]"},
      {"time":"2015-03-05T13:07:45.010482653-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-731100359"},
      {"time":"2015-03-05T13:07:31.238884089-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-222093608"},
      {"time":"2015-03-05T13:07:16.865638393-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-185687812"},
      {"time":"2015-03-05T13:07:03.945582546-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-641087729"},
      {"time":"2015-03-05T13:06:51.318452939-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-181683070"},
      {"time":"2015-03-05T13:06:38.910331539-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-287027880"},
      {"time":"2015-03-05T13:06:25.000359778-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-120331902"},
      {"time":"2015-03-05T13:06:11.539383286-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-821133243"},
      {"time":"2015-03-05T13:05:57.860348114-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-422572967"},
      {"time":"2015-03-05T13:05:41.089453921-08:00","errMsg":"Pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default failed to start, err=map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:Can't move to state 2 - dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1's current state is Error, can only move to state [Stopping]]\n"},
      {"time":"2015-03-05T13:04:24.718275584-08:00","errMsg":"map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:NOT_MY_VBUCKET]"},
      {"time":"2015-03-05T13:03:14.274606443-08:00","errMsg":"map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:NOT_MY_VBUCKET]"},
      {"time":"2015-03-05T13:02:37.037838062-08:00","errMsg":"map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:NOT_MY_VBUCKET]"},
      {"time":"2015-03-05T13:02:19.801753005-08:00","errMsg":"Pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default failed to start"},
      {"time":"2015-03-05T13:02:01.327778412-08:00","errMsg":"Pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default failed to start"},
      {"time":"2015-03-05T13:01:15.290117389-08:00","errMsg":"Pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default failed to start"},
      {"time":"2015-03-05T13:00:28.447422651-08:00","errMsg":"map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:dcp stream for vb=511 is closed by producer]"}
      ]}, progress={Received error report : map[xmem_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.47:11210_1:Xmem is stuck xmem_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.47:11210_0:Xmem is stuck dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_0:dcp stream for vb=165 is closed by producer dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:dcp stream for vb=791 is closed by producer], declare pipeline broken}]
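The error entries in the replication status above are individually valid JSON objects, so they can be grouped and ordered to make the failure sequence easier to read. The following is a rough sketch that works on a handful of entries copied from that list (not the full list). The category labels and the helper names `classify`, `tally`, and `timeline` are hypothetical and are not goxdcr terminology.

```python
import json
from collections import Counter

# A subset of the error entries, copied from the replication status above.
ENTRIES = [
    '{"time":"2015-03-05T13:10:05.139728617-08:00","errMsg":"map[xmem_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.47:11210_1:Xmem is stuck]"}',
    '{"time":"2015-03-05T13:07:45.010482653-08:00","errMsg":"Failed to get starting seqno for pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default-731100359"}',
    '{"time":"2015-03-05T13:04:24.718275584-08:00","errMsg":"map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:NOT_MY_VBUCKET]"}',
    '{"time":"2015-03-05T13:03:14.274606443-08:00","errMsg":"map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:NOT_MY_VBUCKET]"}',
    '{"time":"2015-03-05T13:02:19.801753005-08:00","errMsg":"Pipeline replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default failed to start"}',
    '{"time":"2015-03-05T13:00:28.447422651-08:00","errMsg":"map[dcp_replicationSpec/87f047fbbdaa6dd1f0d5abcbe152b146/default/default_172.23.106.45:11210_1:dcp stream for vb=511 is closed by producer]"}',
]

# (substring, label) pairs; the labels are illustrative only.
CATEGORIES = [
    ("Failed to get starting seqno", "no starting seqno"),
    ("NOT_MY_VBUCKET", "NOT_MY_VBUCKET"),
    ("Xmem is stuck", "xmem stuck"),
    ("closed by producer", "dcp stream closed by producer"),
    ("failed to start", "pipeline failed to start"),
]


def classify(err_msg):
    """Map an errMsg string to the first matching category label."""
    for needle, label in CATEGORIES:
        if needle in err_msg:
            return label
    return "other"


def tally(entries):
    """Count entries per category."""
    return Counter(classify(json.loads(e)["errMsg"]) for e in entries)


def timeline(entries):
    """Return (time, label) pairs, oldest first.

    Plain string sorting is enough here because every timestamp uses the
    same format and the same UTC offset.
    """
    parsed = [json.loads(e) for e in entries]
    return sorted((p["time"], classify(p["errMsg"])) for p in parsed)


for when, label in timeline(ENTRIES):
    print(when, label)
print(tally(ENTRIES))
```

Read oldest first, the sampled entries run from a DCP stream closed by the producer, through pipeline start failures and NOT_MY_VBUCKET responses, to the "Xmem is stuck" report.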

            People

              apiravi Aruna Piravi (Inactive)
              apiravi Aruna Piravi (Inactive)