Details
Bug
Resolution: Fixed
Major
7.1.1
Untriaged
1
Yes
Description
We saw recovery tests fail on build 7.1.1-3166. These runs fail over a node, add the node back, and then start delta recovery.
2022-06-21T18:29:26 [INFO] Sleeping 1200 seconds before triggering failover
2022-06-21T18:49:26 [INFO] Failing over node: 172.23.99.206
2022-06-21T18:49:26 [INFO] Getting OTP node name from 172.23.99.206
2022-06-21T18:49:26 [INFO] Adding node back: 172.23.99.206
2022-06-21T18:49:26 [INFO] Getting OTP node name from 172.23.99.206
2022-06-21T18:49:26 [INFO] Enabling delta recovery: 172.23.99.206
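The failover, add-back, and delta-recovery steps logged above can be sketched as a sequence of cluster REST calls. This is only an illustration, not the test harness's actual code: the function name `delta_recovery_requests` is hypothetical, and the endpoint paths and form fields are assumed from the Couchbase cluster-management REST API rather than taken from this report. The OTP node name is passed in, since the log shows the harness looking it up separately.

```python
from typing import Dict, List, Tuple

Request = Tuple[str, Dict[str, str]]  # (endpoint path, form data)


def delta_recovery_requests(otp_node: str, graceful: bool = False) -> List[Request]:
    """Return the requests for failover -> add back -> enable delta recovery.

    Hypothetical helper. The endpoint paths and field names are assumptions
    based on the Couchbase REST API, not details stated in the report.
    """
    failover_path = (
        "/controller/startGracefulFailover" if graceful else "/controller/failOver"
    )
    return [
        # 1. Fail the node over (hard or graceful).
        (failover_path, {"otpNode": otp_node}),
        # 2. Add the failed-over node back to the cluster.
        ("/controller/reAddNode", {"otpNode": otp_node}),
        # 3. Mark the node for delta (rather than full) recovery.
        ("/controller/setRecoveryType", {"otpNode": otp_node, "recoveryType": "delta"}),
    ]


if __name__ == "__main__":
    # The OTP node name below is an assumed example value.
    for path, data in delta_recovery_requests("ns_1@172.23.99.206"):
        print(path, data)
```

A rebalance request would follow these three calls to carry out the recovery.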
After delta recovery is enabled, the rebalance request is rejected with deltaRecoveryNotPossible and fails after repeated retries.
2022-06-21T19:13:07 [WARNING] {"deltaRecoveryNotPossible":1}
2022-06-21T19:13:07 [WARNING] Retrying http://172.23.99.203:8091/controller/rebalance
2022-06-21T19:13:17 [WARNING] {"deltaRecoveryNotPossible":1}
2022-06-21T19:13:17 [WARNING] Retrying http://172.23.99.203:8091/controller/rebalance
2022-06-21T19:13:27 [ERROR] Request http://172.23.99.203:8091/controller/rebalance failed after 20 attempts
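The retry pattern visible in the log above (a warning carrying the response body, a retry about every 10 seconds, and a final error after a fixed number of attempts) can be sketched roughly as follows. This is an illustrative sketch only: `post_with_retry`, the injected `send` callable, and the default values are hypothetical, chosen to loosely mirror the log rather than taken from the harness.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)


def post_with_retry(
    send: Callable[[], Response],
    url: str,
    attempts: int = 20,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it returns a 2xx status or the attempts run out.

    Hypothetical helper: send is injected so the loop can be exercised
    without a live cluster.
    """
    for attempt in range(1, attempts + 1):
        status, body = send()
        if 200 <= status < 300:
            return body
        # Log the rejected response body, as the harness log does.
        print(f"[WARNING] {body}")
        if attempt < attempts:
            print(f"[WARNING] Retrying {url}")
            sleep(delay)
    raise RuntimeError(f"Request {url} failed after {attempts} attempts")
```

In this report the attempts shown all returned the same `deltaRecoveryNotPossible` body, so the retries only delayed the final error.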
Delta recovery after hard failover (min), 3 -> 4, 1 bucket x 100M x 2KB, 10K ops/sec
http://perf.jenkins.couchbase.com/job/hestia/7778/
Delta recovery after graceful failover (min), 3 -> 4, 1 bucket x 100M x 2KB, 10K ops/sec
http://perf.jenkins.couchbase.com/job/hestia/7777/
For Gerrit Dashboard: MB-52640

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
176557,7 | MB-52640, MB-52639: Fix broken Delta Recovery | neo | ns_server | MERGED | +2 | +1 |
176564,1 | Debug build for MB-52640 | neo | ns_server | ABANDONED | 0 | 0 |
176619,1 | MB-52640, MB-52639: Fix broken Delta Recovery | neo | ns_server | ABANDONED | 0 | 0 |
176658,1 | Merge remote-tracking branch 'couchbase/neo' | master | ns_server | MERGED | +2 | +1 |