Details
Bug
Resolution: Fixed
Blocker
3.0-Beta
Security Level: Public
None
CentOS 6.x
Untriaged
Unknown
June 30 - July 18
Description
I'm seeing a bug similar to MB-11573 on build 991: 600 replica items have not been deleted. However, curr_items and vb_active_curr_items are correct.
2014-07-21 18:18:44 | INFO | MainProcess | Cluster_Thread | [task.check] Saw curr_items 2800 == 2800 expected on '172.23.106.47:8091''172.23.106.48:8091',default bucket
2014-07-21 18:18:45 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 172.23.106.47:11210 default
2014-07-21 18:18:45 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 172.23.106.48:11210 default
2014-07-21 18:18:45 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 2800 == 2800 expected on '172.23.106.47:8091''172.23.106.48:8091',default bucket
2014-07-21 18:18:45 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 172.23.106.47:11210 default
2014-07-21 18:18:45 | INFO | MainProcess | Cluster_Thread | [data_helper.direct_client] creating direct client 172.23.106.48:11210 default
2014-07-21 18:18:45 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 3400 == 2800 expected on '172.23.106.47:8091''172.23.106.48:8091', default bucket
2014-07-21 18:18:48 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 3400 == 2800 expected on '172.23.106.47:8091''172.23.106.48:8091', sasl_bucket_1 bucket
2014-07-21 18:18:49 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 3400 == 2800 expected on '172.23.106.47:8091''172.23.106.48:8091', standard_bucket_1 bucket
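For anyone triaging a longer log, the mismatching stats can be extracted mechanically. Below is a minimal Python sketch, not part of testrunner; the regular expression, the function name find_mismatches and the SAMPLE_LOG constant are hypothetical and only assume the [task.check] line format quoted above.

```python
import re

# Matches the [task.check] lines quoted in this ticket, e.g.
#   "... [task.check] Not Ready: vb_replica_curr_items 3400 == 2800 expected on '...', default bucket"
# The format is inferred from the lines above; other log formats are not handled.
CHECK_RE = re.compile(
    r"\[task\.check\] (?:Saw|Not Ready:) (?P<stat>\w+) "
    r"(?P<seen>\d+) == (?P<expected>\d+) expected"
    r".*?,\s*(?P<bucket>\S+) bucket"
)

# Two lines copied from the log in this ticket, used as sample input.
SAMPLE_LOG = """\
2014-07-21 18:18:45 | INFO | MainProcess | Cluster_Thread | [task.check] Saw vb_active_curr_items 2800 == 2800 expected on '172.23.106.47:8091''172.23.106.48:8091',default bucket
2014-07-21 18:18:45 | WARNING | MainProcess | Cluster_Thread | [task.check] Not Ready: vb_replica_curr_items 3400 == 2800 expected on '172.23.106.47:8091''172.23.106.48:8091', default bucket
"""


def find_mismatches(log_text):
    """Return (bucket, stat, seen, expected, surplus) for every stat that is off."""
    mismatches = []
    for line in log_text.splitlines():
        match = CHECK_RE.search(line)
        if match is None:
            continue  # not a [task.check] stat line
        seen = int(match["seen"])
        expected = int(match["expected"])
        if seen != expected:
            mismatches.append(
                (match["bucket"], match["stat"], seen, expected, seen - expected)
            )
    return mismatches


if __name__ == "__main__":
    for row in find_mismatches(SAMPLE_LOG):
        print(row)
```

On the two sample lines, only the vb_replica_curr_items check is reported, with a surplus of 600 items.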
Test case:
./testrunner -i sanity.ini -t xdcr.pauseResumeXDCR.PauseResumeTest.replication_with_pause_and_resume,reboot=dest_node,items=2000,rdirection=bidirection,replication_type=xmem,standard_buckets=1,sasl_buckets=1,pause=source-destination,doc-ops=update-delete,doc-ops-dest=update-delete
What the test does:
Two 3-node clusters, bidirectional XDCR on 3 buckets.
1. Load 2k items on both clusters. Pause all XDCR (all items have been replicated by this time).
2. Reboot one destination node (.48).
3. After warmup, resume replication on all buckets, on both clusters.
4. Update 30% and delete 30% of the items on both sides. No expiration is set.
5. Verify item count, values and rev-ids.
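The expected counts follow from the steps above, and the arithmetic can be sketched in Python. This is only an illustration under stated assumptions: the two clusters load disjoint key sets, each side deletes 30% of its own 2000 items, updates do not change item counts, and each bucket has one replica. The function name expected_counts is hypothetical.

```python
def expected_counts(items_per_cluster, delete_percent, clusters=2, replicas=1):
    """Expected (active, replica) item counts per bucket after the doc ops.

    Assumptions (not confirmed by the test code): disjoint keys per cluster,
    deletes applied to each cluster's own items, one replica per bucket.
    """
    # Bidirectional XDCR: each cluster ends up holding the items of both.
    total_loaded = items_per_cluster * clusters
    # Each side deletes delete_percent of the items it loaded.
    deleted = (items_per_cluster * delete_percent // 100) * clusters
    active = total_loaded - deleted
    # Every active item should have `replicas` replica copies.
    replica = active * replicas
    return active, replica


active, replica = expected_counts(items_per_cluster=2000, delete_percent=30)
observed_replica = 3400  # vb_replica_curr_items from the log above
stale_replicas = observed_replica - replica

print(active, replica, stale_replicas)
```

Under these assumptions the expected active and replica counts are both 2800, and the observed 3400 replica items leave a surplus of 600, which matches the number of undeleted replica items reported above.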
The cluster is available for debugging until tomorrow morning. Thanks.