Details
- Bug
- Resolution: Duplicate
- Critical
- Columnar 1.0.1
- None
- Sandbox, Build 1.0.1-2313
- 0
- Yes
- Analytics Sprint 48
Description
A rebalance failure was observed in the logs during system tests.
Rebalance exited with reason {{badmatch,failed}, [{ns_rebalancer,rebalance_body,7, [{file,"src/ns_rebalancer.erl"}, {line,500}]}, {async,'-async_init/4-fun-1-',3, [{file,"src/async.erl"},{line,199}]}]}. Rebalance Operation Id = eaf14e837e09b9fa8cbfa48e1123885b
Analytics Service unable to successfully rebalance 966618e9ff7a1c2ebd53bda934575b9a due to 'java.lang.IllegalStateException: timed out waiting for keep nodes to join & have partitions fully active (missing nodes: [svc-da-node-002.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com:8091 (00e8c0a07f2a167fd9505b0a135d806c), svc-da-node-004.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com:8091 (6cfc42f243b1fd766abff7b13ab6e752)]), metadata node active: true'; see analytics_info.log for details
The system was trying to fail over a node:
2024-08-19T22:52:03.416+00:00 INFO CBAS.rebalance.Rebalance [HttpExecutor(port:9111)-3] keep nodes: [svc-da-node-001.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com:8091 (a01a5c04a60206bfdffe9e7cf4b3a43f), svc-da-node-002.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com:8091 (00e8c0a07f2a167fd9505b0a135d806c), svc-da-node-004.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com:8091 (6cfc42f243b1fd766abff7b13ab6e752)], pending ejects: [], failedOver: [null (f066f1c77e314045e23015c5831a0263)]
2024-08-19T22:52:03.419+00:00 INFO CBAS.rebalance.Rebalance [Rebalancer (3ca6a1a696212ac32be2bd76d4706cb7)] Failing over the following nodes: [null (f066f1c77e314045e23015c5831a0263)]
2024-08-19T22:52:03.419+00:00 INFO CBAS.cluster.NodeManager [Rebalancer (3ca6a1a696212ac32be2bd76d4706cb7)] f066f1c77e314045e23015c5831a0263 considered dead
2024-08-19T22:52:03.434+00:00 ERRO CBAS.executor.JobExecutor [Rebalancer (3ca6a1a696212ac32be2bd76d4706cb7)] Unexpected failure. Aborting job JID:0.1926
org.apache.hyracks.api.exceptions.HyracksException: HYR0010: Node f066f1c77e314045e23015c5831a0263 does not exist
    at org.apache.hyracks.api.exceptions.HyracksException.create(HyracksException.java:58) ~[hyracks-api.jar:1.0.1-2313]
    at org.apache.hyracks.control.cc.executor.JobExecutor.assignLocation(JobExecutor.java:473) ~[hyracks-control-cc.jar:1.0.1-2313]
    at org.apache.hyracks.control.cc.executor.JobExecutor.assignTaskLocations(JobExecutor.java:365) ~[hyracks-control-cc.jar:1.0.1-2313]
    at org.apache.hyracks.control.cc.executor.JobExecutor.startRunnableTaskClusters(JobExecutor.java:245) ~[hyracks-control-cc.jar:1.0.1-2313]
    at org.apache.hyracks.control.cc.executor.JobExecutor.startRunnableActivityClusters(JobExecutor.java:209) ~[hyracks-control-cc.jar:1.0.1-2313]
    at org.apache.hyracks.control.cc.executor.JobExecutor.notifyNodeFailures(JobExecutor.java:733) ~[hyracks-control-cc.jar:1.0.1-2313]
    at org.apache.hyracks.control.cc.cluster.NodeManager.failNode(NodeManager.java:204) ~[hyracks-control-cc.jar:1.0.1-2313]
    at com.couchbase.analytics.control.rebalance.Rebalance.beforeLock(Rebalance.java:196) ~[columnar-server.jar:1.0.1-2313]
    at com.couchbase.analytics.control.rebalance.Rebalance.lambda$start$11(Rebalance.java:541) ~[columnar-server.jar:1.0.1-2313]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
    at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
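For triage, the correlation visible in the excerpt above (a node id listed under "Failing over the following nodes" is the same id later reported by HYR0010 as nonexistent) can be checked mechanically across the collected logs. The following is only a rough sketch: the function name `correlate` and the regular expressions are assumptions derived from the log lines quoted in this ticket, not from any Couchbase tool, and the message formats may differ in other builds.

```python
import re

# Patterns are assumptions based solely on the log lines quoted in this ticket.
FAILOVER_RE = re.compile(r"Failing over the following nodes: \[(.*)\]")
NODE_ID_RE = re.compile(r"\(([0-9a-f]{32})\)")
MISSING_RE = re.compile(r"HYR0010: Node ([0-9a-f]{32}) does not exist")


def correlate(lines):
    """Return node ids that were failed over and also reported as nonexistent."""
    failed_over = set()
    missing = set()
    for line in lines:
        m = FAILOVER_RE.search(line)
        if m:
            # Each node appears as "<host or null> (<32-hex id>)".
            failed_over.update(NODE_ID_RE.findall(m.group(1)))
        m = MISSING_RE.search(line)
        if m:
            missing.add(m.group(1))
    return sorted(failed_over & missing)


# Two lines taken from the excerpt in this ticket.
sample = [
    "2024-08-19T22:52:03.419+00:00 INFO CBAS.rebalance.Rebalance "
    "[Rebalancer (3ca6a1a696212ac32be2bd76d4706cb7)] Failing over the "
    "following nodes: [null (f066f1c77e314045e23015c5831a0263)]",
    "org.apache.hyracks.api.exceptions.HyracksException: HYR0010: "
    "Node f066f1c77e314045e23015c5831a0263 does not exist",
]
print(correlate(sample))
```

Run against the lines of analytics_info.log from each collectinfo archive, a non-empty result would point at the same failure pattern described here.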
Logs:
https://cb-engineering.s3.amazonaws.com/SysTestCapella/collectinfo-2024-08-20T004620-ns_1%40svc-da-node-001.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestCapella/collectinfo-2024-08-20T004620-ns_1%40svc-da-node-002.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestCapella/collectinfo-2024-08-20T004620-ns_1%40svc-da-node-003.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestCapella/collectinfo-2024-08-20T004620-ns_1%40svc-da-node-004.er65w5qpwe3wffcy.sandbox.nonprod-project-avengers.com.zip
Attachments
Issue Links
- duplicates
- MB-63217 [System Test] HYR0087: Unequal number of trees and filters found in /var/cb-
- Resolved