  1. Couchbase Server
  2. MB-11234

UPR: Some replica expired items are stuck in UPR queue


Details

    • Bug
    • Resolution: Fixed
    • Test Blocker
    • 3.0
    • 3.0
    • couchbase-bucket
    • Security Level: Public
    • None
    • Mac OS X cluster run (latest code); CentOS VMs (builds 741, 744)
    • Untriaged
    • Unknown
    • June 30 - July 18

    Description

      Build
      --------
      Found on 3.0.0-741, where UPR is the default internal replication protocol.

      Reproducible?
      ---------------------
      Yes, consistently with cluster run on Mac. It occurs only with UPR, not with TAP.

      Scenario
      --------------
      1. Set up bidirectional XDCR (bi-XDCR) between 3 buckets.
      2. Load 20K items on both sides. Pause XDCR on both sides during loading.
      3. Resume XDCR. Expect 40K items in all buckets on both sides after bi-XDCR.
      4. Update 30% of the keys on each cluster (non-overlapping keys) with an expiration time of 20s.
      5. Delete 30% of the keys on each cluster (non-overlapping keys).
      6. Run the expiry pager after 20s. Verify the item count and do other validations.
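
The expected item count of 16000 that appears in the logs further down can be reproduced from the scenario steps. The sketch below is illustrative only. It assumes that each cluster loads 20,000 distinct keys, that the 30% figures apply to each cluster's own 20,000 keys, and that the updated and deleted key sets do not overlap. The function and parameter names are hypothetical and are not taken from testrunner.

```python
def expected_item_count(items_per_cluster, clusters, expire_pct, delete_pct):
    """Items expected in each bucket once expiry and deletes have replicated.

    Assumption: expired and deleted key sets are disjoint, and the
    percentages apply to each cluster's own keys.
    """
    total = items_per_cluster * clusters                      # after bi-XDCR converges
    expired = (items_per_cluster * expire_pct // 100) * clusters
    deleted = (items_per_cluster * delete_pct // 100) * clusters
    return total - expired - deleted


# Scenario values: 20K items per side, 2 clusters, 30% expired, 30% deleted.
count = expected_item_count(20000, 2, 30, 30)
print(count)
```

With the scenario values this gives 40000 - 12000 - 12000 = 16000, which is the value the test expects for curr_items, vb_active_curr_items and vb_replica_curr_items.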

      Testcase
      --------------

      ./testrunner -i bixdcr.ini -t xdcr.pauseResumeXDCR.PauseResumeTest.replication_with_pause_and_resume,items=20000,rdirection=bidirection,ctopology=chain,standard_buckets=1,expires=20,pause=source-destination,doc-ops=update-delete,doc-ops-dest=update-delete

      This is the same issue as MB-11104, which was fixed recently for TAP. After that fix, curr_items, vb_active_curr_items and vb_replica_curr_items are consistent with the expected values for TAP.
      However, with UPR as the internal replication protocol, curr_items and vb_active_curr_items are correct, but vb_replica_curr_items is higher than expected, as seen below.

      Same build, UPR vs TAP
      ---------------------------------------

      TAP:

      2014-05-28 12:21:57 | INFO | MainProcess | Cluster_Thread | [task.check] Saw curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',sasl_bucket_1 bucket
      2014-05-28 12:21:57 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 sasl_bucket_1
      2014-05-28 12:21:57 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 sasl_bucket_1
      2014-05-28 12:21:57 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',sasl_bucket_1 bucket
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 sasl_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 sasl_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_replica_curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',sasl_bucket_1 bucket
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 standard_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 standard_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [task.check] Saw curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',standard_bucket_1 bucket
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 standard_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 standard_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',standard_bucket_1 bucket
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 standard_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 standard_bucket_1
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_replica_curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',standard_bucket_1 bucket
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 default
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 default
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [task.check] Saw curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',default bucket
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 default
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 default
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',default bucket
      2014-05-28 12:21:58 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12004 default
      2014-05-28 12:21:59 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12006 default
      2014-05-28 12:21:59 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_replica_curr_items 16000 == 16000 expected on '127.0.0.1:9002''127.0.0.1:9003',default bucket

      UPR:

      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [task.check] Saw curr_items 16000 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001',sasl_bucket_1 bucket
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 sasl_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 sasl_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 16000 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001',sasl_bucket_1 bucket
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 sasl_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 sasl_bucket_1
      2014-05-28 12:47:23 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16192 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', sasl_bucket_1 bucket
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 standard_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 standard_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [task.check] Saw curr_items 16000 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001',standard_bucket_1 bucket
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 standard_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 standard_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 16000 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001',standard_bucket_1 bucket
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 standard_bucket_1
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 standard_bucket_1
      2014-05-28 12:47:23 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16139 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', standard_bucket_1 bucket
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 default
      2014-05-28 12:47:23 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 default
      2014-05-28 12:47:24 | INFO | MainProcess | Cluster_Thread | [task.check] Saw curr_items 16000 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001',default bucket
      2014-05-28 12:47:24 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 default
      2014-05-28 12:47:24 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 default
      2014-05-28 12:47:24 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 16000 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001',default bucket
      2014-05-28 12:47:24 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12000 default
      2014-05-28 12:47:24 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 127.0.0.1:12002 default
      2014-05-28 12:47:24 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16205 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', default bucket
      2014-05-28 12:47:28 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16192 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', sasl_bucket_1 bucket
      2014-05-28 12:47:29 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16139 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', standard_bucket_1 bucket
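
To read the per-bucket surplus off the UPR log quickly, the [task.check] lines can be parsed with a small script. This is a rough sketch, not part of testrunner: the regular expression is inferred only from the log lines quoted above, and the function name is hypothetical.

```python
import re

# Pattern inferred from the "[task.check]" lines quoted in this ticket.
CHECK_LINE = re.compile(
    r"\[task\.check\] (?:Saw|Not Ready:) (?P<stat>\w+) "
    r"(?P<actual>\d+) == (?P<expected>\d+) expected on .*?,\s*(?P<bucket>\w+) bucket"
)


def replica_excess(lines, stat="vb_replica_curr_items"):
    """Return {bucket: actual - expected} for lines where the stat is off."""
    excess = {}
    for line in lines:
        match = CHECK_LINE.search(line)
        if not match or match.group("stat") != stat:
            continue
        diff = int(match.group("actual")) - int(match.group("expected"))
        if diff != 0:
            excess[match.group("bucket")] = diff  # keep the latest value seen
    return excess


sample = [
    "2014-05-28 12:47:23 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16192 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', sasl_bucket_1 bucket",
    "2014-05-28 12:47:23 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16139 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', standard_bucket_1 bucket",
    "2014-05-28 12:47:24 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 16000 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001',default bucket",
    "2014-05-28 12:47:24 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 16205 == 16000 expected on '127.0.0.1:9000''127.0.0.1:9001', default bucket",
]
result = replica_excess(sample)
print(result)
```

For the UPR run above, the surplus replica items are 192 (sasl_bucket_1), 139 (standard_bucket_1) and 205 (default).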

      Important note
      ----------------------
      Please note that the extra 205 replica items in the default bucket show up in the newly added UPR queue stat ep_upr_replica_items_remaining, which is expected to be 0. Please see the screenshots.
      The same applies to the other buckets.

          People

            abhinav Abhi Dangeti
            apiravi Aruna Piravi (Inactive)