Details
- Bug
- Resolution: Duplicate
- Critical
- Morpheus, 7.0.5
- Untriaged
- Yes
Description
Multiple tests fail in 7.0.5-7637 with a 500 "Unexpected server error". The same tests (including py-xdcr-rebalance-4) pass in 7.0.5-7632.
Script to reproduce:
./testrunner -i /tmp/testexec.46436.ini -p stop-on-failure=False,fail_on_errors=1,get-cbcollect-info=False,java_sdk_client=True,get-cbcollect-info=True -t xdcr.rebalanceXDCR.Rebalance.swap_rebalance_out_master,items=10000,rdirection=unidirection,ctopology=chain,update=C1,delete=C1,rebalance=C2,GROUP=P1;xmem,stop-on-failure=False,fail_on_errors=1,get-cbcollect-info=False,java_sdk_client=True
Cluster configuration:
[cluster1]
1:_1
2:_2
[cluster2]
1:_3
2:_4
[servers]
1:_1
2:_2
3:_3
4:_4
5:_5
6:_6
[_1]
ip:172.23.107.58
[_2]
ip:172.23.107.91
[_3]
ip:172.23.107.85
[_4]
ip:172.23.107.84
[_5]
ip:172.23.107.88
[_6]
ip:172.23.107.78
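To make the topology easier to read, the cluster configuration above can be resolved into per-cluster IP lists. This is a minimal sketch using Python's standard `configparser`; it is not testrunner's own ini loader, and the helper name `resolve_clusters` is hypothetical.

```python
import configparser

# Cluster configuration copied from this ticket.
INI_TEXT = """
[cluster1]
1:_1
2:_2
[cluster2]
1:_3
2:_4
[servers]
1:_1
2:_2
3:_3
4:_4
5:_5
6:_6
[_1]
ip:172.23.107.58
[_2]
ip:172.23.107.91
[_3]
ip:172.23.107.85
[_4]
ip:172.23.107.84
[_5]
ip:172.23.107.88
[_6]
ip:172.23.107.78
"""


def resolve_clusters(ini_text):
    """Map each [clusterN] section to the ordered list of its node IPs."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    clusters = {}
    for section in parser.sections():
        if not section.startswith("cluster"):
            continue
        # Keys are node positions ("1", "2", ...); values are aliases ("_1", ...).
        ordered = sorted(parser[section].items(), key=lambda kv: int(kv[0]))
        clusters[section] = [parser[alias]["ip"] for _, alias in ordered]
    return clusters


print(resolve_clusters(INI_TEXT))
```

Nodes `_5` and `_6` appear only under `[servers]`, so they belong to neither cluster at the start of the test; 172.23.107.78 (`_6`) is the node added during the swap rebalance in the steps below.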
Steps to reproduce
- Adding remote node @172.23.107.91:8091 to this cluster @172.23.107.58:8091.
- Rebalance operation started and completed with 100% progress
- Adding user: cbadminbucket with roles: admin
- Delete User: Exception while deleting user. Exception is -b'"User was not found."'
- Repeated steps 4 and 5.
- Create bucket in http://172.23.107.58:8091/pools/default/buckets
- Create scope (default->scope_1)
- Create single collection (default->scope_1->collection_1)
- Repeat steps 7, 8 and 9 for http://172.23.107.85:8091/pools/default/buckets
- Add remote cluster hostname:172.23.107.85:8091 with username:password Administrator:password name:remote_cluster_C1-C2 to source node: 172.23.107.58:8091
- Started continuous replication type:xmem from default to default in the remote cluster remote_cluster_C1-C2, replication created with id: 60e83a74dd920b61de3d936f80f96921/default/default
- Updated checkpointInterval=60 on bucket 'default' on 172.23.107.58
- Started swap-rebalance [remove_node:172.23.107.85] -> [add_node:172.23.107.78] at C2 cluster 172.23.107.85
- Added remote node @172.23.107.78:8091 to this cluster @172.23.107.85:8091
- Rebalance operation started and completed with 100% progress
- error: 404 reason: unknown b'"unknown pool"' auth: Administrator:password http://172.23.107.85:8091/pools/default with status False: unknown pool
- Updating keys @ C1
java -jar java_sdk_client/collections/target/javaclient/javaclient.jar -i 172.23.107.58 -u Administrator -p password -b default -s default -c _default -n 1000 -pc 0 -pu 30 -pd 0 -l uniform -dsn 1 -dpx doc -dt Person -de 0 -ds 500 -ac True -st 0 -en 999 -o False -sd False
- Deleting keys @ C1
2022-11-03 23:53:49 | INFO | MainProcess | Cluster_Thread | [task.execute_for_collection] java -jar java_sdk_client/collections/target/javaclient/javaclient.jar -i 172.23.107.58 -u Administrator -p password -b default -s default -c _default -n 1000 -pc 0 -pu 0 -pd 30 -l uniform -dsn 1 -dpx doc -dt Person -de 0 -ds 500 -ac True -st 0 -en 999 -o False -sd False
- Merging keys for replication Replication C1:default -> C2:default (After merging: destination bucket's kv_store now has 0 valid keys and 0 deleted keys)
- Creating direct client 172.23.107.58:11210 default, and setting flush param on server and setting param: exp_pager_stime 10. Repeat for 172.23.107.91:11210 default, 172.23.107.78:11210 default, 172.23.107.84:11210 default. Wait for expiry pager to run on all of these nodes.
- Waiting for dcp queue to drain on cluster node: 172.23.107.58, then 172.23.107.84
- Local variable 'mutations' referenced before assignment
- error: 500 reason: unknown b'["Unexpected server error, request logged."]' auth: Administrator:password
- XDCRNewbasetests cleanup is started for test #1 swap_rebalance_out_master, removing xdcr/nodes settings
2022-11-03 23:57:19 | ERROR | MainProcess | test_thread | [rest_client._http_request] GET http://172.23.107.58:8091/pools/default/tasks body: headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Accept': '*/*'} error: 500 reason: unknown b'["Unexpected server error, request logged."]' auth: Administrator:password
Jenkins Job Link (xdcr-rebalance-4) (Test 1): http://qa.sc.couchbase.com/job/test_suite_executor/523769/
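Since the 500 shows up in multiple tests, it can help to scan the console log for the same signature. The sketch below assumes the pipe-delimited line layout seen in the log lines quoted in this ticket (timestamp | level | process | thread | [source] message); the helper name `find_http_errors` is hypothetical, and the sample line abbreviates the logged headers.

```python
import re

# Layout inferred from the log lines quoted in this ticket.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| (?P<level>\w+) \| "
    r"(?P<process>[^|]+?) \| (?P<thread>[^|]+?) \| "
    r"\[(?P<source>[^\]]+)\] (?P<message>.*)$"
)
# Matches "<METHOD> <url> ... error: <status>" inside the message part.
HTTP_ERROR = re.compile(
    r"(?P<method>GET|POST|PUT|DELETE) (?P<url>http\S+).*?error: (?P<status>\d{3})"
)

SAMPLE = (
    "2022-11-03 23:57:19 | ERROR | MainProcess | test_thread | "
    "[rest_client._http_request] GET http://172.23.107.58:8091/pools/default/tasks "
    "body: headers: {'Accept': '*/*'} error: 500 reason: unknown "
    "b'[\"Unexpected server error, request logged.\"]' auth: Administrator:password"
)


def find_http_errors(lines, status="500"):
    """Return (timestamp, method, url) for ERROR lines with the given HTTP status."""
    hits = []
    for line in lines:
        parsed = LOG_LINE.match(line.strip())
        if not parsed or parsed["level"] != "ERROR":
            continue
        error = HTTP_ERROR.search(parsed["message"])
        if error and error["status"] == status:
            hits.append((parsed["ts"], error["method"], error["url"]))
    return hits


print(find_http_errors([SAMPLE]))
```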
Error
2022-11-03 23:57:19 | ERROR | MainProcess | test_thread | [rest_client._http_request] GET
http://172.23.107.58:8091/pools/default/tasks
body: headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Accept': '*/*'} error: 500 reason: unknown b'["Unexpected server error, request logged."]' auth: Administrator:password
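The failing request can be replayed outside testrunner to check whether the node still returns 500. This is a minimal sketch using only the Python standard library, not the test framework's `rest_client`; the function names `build_tasks_request` and `fetch_tasks_status` are hypothetical, and the credentials are the ones shown in the log above.

```python
import base64
import urllib.error
import urllib.request


def build_tasks_request(host, user="Administrator", password="password", port=8091):
    """Build the GET /pools/default/tasks request with HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"http://{host}:{port}/pools/default/tasks", method="GET"
    )
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "*/*")
    return request


def fetch_tasks_status(host, timeout=10):
    """Send the request and return (status_code, body), including for HTTP errors."""
    try:
        with urllib.request.urlopen(build_tasks_request(host), timeout=timeout) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        # A 500 from the server arrives here; keep the body for the report.
        return err.code, err.read().decode()
```

Calling `fetch_tasks_status("172.23.107.58")` from a machine that can reach the node sends the same request as the failing log line. The `Authorization` value the sketch produces matches the one in the log, since it is the base64 encoding of `Administrator:password`.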
Logs
s3://cb-customers-secure/mb-54424/2022-11-04/172.23.107.58-20221103-2358-diag.zip
s3://cb-customers-secure/mb-54424/2022-11-04/172.23.107.78-20221103-2358-diag.zip
s3://cb-customers-secure/mb-54424/2022-11-04/172.23.107.84-20221103-2358-diag.zip
s3://cb-customers-secure/mb-54424/2022-11-04/172.23.107.85-20221103-2358-diag.zip
s3://cb-customers-secure/mb-54424/2022-11-04/172.23.107.88-20221103-2358-diag.zip
s3://cb-customers-secure/mb-54424/2022-11-04/172.23.107.91-20221103-2358-diag.zip
http://supportal.couchbase.com/snapshot/6f7646a55ddabf5f081a701f1b55fbcc::0
Issue Links
- duplicates MB-54366 [BP-7.0.5] - XDCR Metakv callbacks racing when remote cluster ref is added/changed (Resolved)