Details
Description
QE test:
+ ./testrunner -i INI_FILE.ini -p '
' -t xdcr.uniXDCR.unidirectional.load_with_ops,items=50000,standard_buckets=1,sasl_buckets=1,ctopology=chain,rdirection=unidirection,replication_type=xmem,doc-ops=update-delete
2015-03-01 09:27:38 | INFO | MainProcess | test_thread | [rest_client.stop_replication] Deleting replication controller/cancelXDCR/replicationSpec%2F1f734b3fe46766b17f505659de79ff3d%2Fsasl_bucket_1%2Fsasl_bucket_1
2015-03-01 09:27:39 | INFO | MainProcess | test_thread | [rest_client.diag_eval] /diag/eval status on 172.23.107.170:8091: True content: true command: ns_config:read_key_fast(goxdcr_enabled, false)
2015-03-01 09:27:39 | INFO | MainProcess | test_thread | [rest_client.stop_replication] Deleting replication controller/cancelXDCR/replicationSpec%2F1f734b3fe46766b17f505659de79ff3d%2Fstandard_bucket_1%2Fstandard_bucket_1
2015-03-01 09:27:39 | INFO | MainProcess | test_thread | [rest_client.remove_remote_cluster] removing remote cluster name:remote_cluster_C1-C2
2015-03-01 09:28:09 | ERROR | MainProcess | test_thread | [rest_client._http_request] http://172.23.107.170:8091/pools/default/remoteClusters/remote_cluster_C1-C2 error 500 reason: unknown ["Unexpected server error, request logged."]
2015-03-01 09:28:09 | ERROR | MainProcess | test_thread | [rest_client.remove_remote_cluster] failed to remove remote cluster: status:False,content:["Unexpected server error, request logged."]
ERROR
======================================================================
ERROR: load_with_ops (xdcr.uniXDCR.unidirectional)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pytests/xdcr/uniXDCR.py", line 23, in tearDown
    super(unidirectional, self).tearDown()
  File "pytests/xdcr/xdcrnewbasetests.py", line 2256, in tearDown
    cb_cluster.cleanup_cluster(self)
  File "pytests/xdcr/xdcrnewbasetests.py", line 1132, in cleanup_cluster
    self.__remove_all_remote_clusters()
  File "pytests/xdcr/xdcrnewbasetests.py", line 1103, in __remove_all_remote_clusters
    remote_cluster_ref.remove()
  File "pytests/xdcr/xdcrnewbasetests.py", line 603, in remove
    self.__name)
  File "lib/membase/api/rest_client.py", line 868, in remove_remote_cluster
    raise Exception("remoteCluster API 'remove cluster' failed")
Exception: remoteCluster API 'remove cluster' failed
----------------------------------------------------------------------
Ran 1 test in 524.494s
What happened:
1. When a replication is deleted, goxdcr is notified by metakv via a callback and then calls metakv to delete the checkpoint documents for that replication one by one.
2. The subsequent REST call removes the remote cluster reference. goxdcr calls metakv synchronously to remove it, but that task is queued behind the checkpoint-document deletions submitted earlier.
3. The REST call times out before metakv finishes removing the remote cluster reference.
We need a way to handle mass deletion more efficiently.
Please find the logs attached.
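The queueing effect described in the steps above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the goxdcr or metakv implementation: it models metakv as a single FIFO worker, and the number of checkpoint documents (1024), the per-operation cost (50 ms), and the operation names are all hypothetical. The 30 s timeout is taken from the gap between the remove request (09:27:39) and the error (09:28:09) in the log above. It also assumes a recursive delete costs about the same as a single delete, which may not hold in practice.

```python
# Illustrative model only: metakv is treated as one serial FIFO worker.
# All constants below except REST_TIMEOUT_MS are assumptions.
NUM_CHECKPOINT_DOCS = 1024   # assumed: one checkpoint doc per vbucket
COST_PER_OP_MS = 50          # assumed cost of one metakv operation
REST_TIMEOUT_MS = 30_000     # from the log: 09:27:39 -> 09:28:09


def build_queue(num_docs, recursive):
    """Return the operations queued ahead of (and including) the remote
    cluster removal. Operation names are hypothetical labels."""
    if recursive:
        ops = ["recursive_delete:checkpoints/"]
    else:
        ops = [f"delete:checkpoints/{i}" for i in range(num_docs)]
    ops.append("delete:remoteCluster")
    return ops


def finish_time_ms(queue, target, cost_per_op_ms):
    """Simulated time at which `target` completes on a serial worker."""
    elapsed = 0
    for op in queue:
        elapsed += cost_per_op_ms
        if op == target:
            return elapsed
    raise ValueError(f"{target!r} is not in the queue")


def remove_times_out(num_docs, recursive):
    """True if the remote cluster removal finishes after the REST timeout."""
    queue = build_queue(num_docs, recursive)
    done = finish_time_ms(queue, "delete:remoteCluster", COST_PER_OP_MS)
    return done > REST_TIMEOUT_MS


if __name__ == "__main__":
    print("one-by-one times out:", remove_times_out(NUM_CHECKPOINT_DOCS, False))
    print("recursive times out: ", remove_times_out(NUM_CHECKPOINT_DOCS, True))
```

Under these assumed numbers, the one-by-one variant places 1024 operations ahead of the remote cluster removal, while the recursive variant places only one, which is why a single recursive delete keeps the synchronous call within the timeout.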
Attachments
For Gerrit Dashboard: MB-13683
# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
48247,2 | MB-13683: Support for recursive deletion in metakv. | master | ns_server | MERGED | +2 | +1 |
48250,2 | MB-13683: Support for recursive deletion in metakv. | master | cbauth | MERGED | +2 | +1 |
48265,5 | MB-13683 metakv can't support batch delete efficiently - use RecursiveDelete API to delete checkpoint docs | master | goxdcr | MERGED | +2 | +1 |
48351,2 | MB-13683:Fix shadow variable make warning. | master | ns_server | MERGED | +2 | +1 |