Details
Bug
Resolution: Fixed
Critical
CBAS DP4
None
Untriaged
Yes
CX Sprint 74
Description
Build: 5.0.0-783 (also seen in build 5.0.0-766)
We have a test that runs 4096 queries in async mode, in batches of 200. The query is: select sleep(count,500) from default_ds
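The batching pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual test: the report does not say how the queries are issued, so `run_in_batches` and `fake_submit` are hypothetical names, and `fake_submit` is a stand-in for whatever call sends the statement to the Analytics service. A client-side thread pool is used here only to keep one batch in flight at a time.

```python
from concurrent.futures import ThreadPoolExecutor

# Query text exactly as quoted in the report.
QUERY = "select sleep(count,500) from default_ds"
TOTAL_QUERIES = 4096
BATCH_SIZE = 200


def run_in_batches(submit, total=TOTAL_QUERIES, batch_size=BATCH_SIZE):
    """Call submit(QUERY) `total` times, one batch of up to `batch_size` at a time.

    Returns the list of batch sizes that were issued.
    """
    batch_sizes = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in range(0, total, batch_size):
            count = min(batch_size, total - start)
            futures = [pool.submit(submit, QUERY) for _ in range(count)]
            for future in futures:
                future.result()  # wait for the whole batch; re-raises any failure
            batch_sizes.append(count)
    return batch_sizes


def fake_submit(statement):
    # Hypothetical stand-in: a real test would send the statement to the
    # Analytics service here and return its response.
    return len(statement)


sizes = run_in_batches(fake_submit)
```

With these numbers the run consists of 20 full batches of 200 queries followed by a final batch of 96.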
All queries run fine, but rebalancing out the analytics node (the only analytics node in the cluster) fails.
The UI diag log says:
Rebalance exited with reason {service_rebalance_failed,cbas, {rebalance_failed,
}}
The analytics.log.1.gz (attached) has lots of errors/warnings like these while the test was running:
2017-10-15T06:08:54.796-07:00 WARN CBAS.work.NotifyTaskFailureWork [Worker:d1e24228352dbf84f8b2bc277f72f35c] d1e24228352dbf84f8b2bc277f72f35c is sending a notification to cc that task TAID:TID:ANID:ODID:3:0:0:0 has failed
org.apache.hyracks.api.exceptions.HyracksDataException: Index resource couldn't be found. Has it been created yet? Was it deleted?
    at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:134) ~[hyracks-api-1.0.0-cbas-dp3.jar:1.0.0-cbas-dp3]
    at org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:63) ~[hyracks-control-common-1.0.0-cbas-dp3.jar:1.0.0-cbas-dp3]
    at org.apache.hyracks.control.nc.Task.run(Task.java:367) ~[hyracks-control-nc-1.0.0-cbas-dp3.jar:1.0.0-cbas-dp3]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
Attachments
For Gerrit Dashboard: MB-26400

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
84405,2 | MB-26400: guard access of NodeControllerState behind WorkQueue | master | asterix-opt | MERGED | +2 | +1 |
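
The merged change itself is not shown on this page, so the following is only an illustration of the general pattern its subject line names: routing every read and write of shared per-node state through a single work-queue thread, so that concurrent callers cannot observe or modify it mid-update. The classes `WorkQueue` and `NodeRegistry` and all of their methods are hypothetical and are not taken from the Hyracks or asterix-opt code.

```python
import queue
import threading


class WorkQueue:
    """Runs submitted callables one at a time, in order, on a single worker thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel: stop the worker
                break
            task()

    def schedule(self, task):
        self._tasks.put(task)

    def stop(self):
        self._tasks.put(None)
        self._worker.join()


class NodeRegistry:
    """Hypothetical per-node state that is only touched on the work-queue thread."""

    def __init__(self, work_queue):
        self._work_queue = work_queue
        self._active_jobs = {}  # node id -> set of job ids; callers never touch this directly

    def job_started(self, node_id, job_id):
        self._work_queue.schedule(
            lambda: self._active_jobs.setdefault(node_id, set()).add(job_id))

    def job_finished(self, node_id, job_id):
        self._work_queue.schedule(
            lambda: self._active_jobs.get(node_id, set()).discard(job_id))

    def snapshot(self):
        """Return a copy of the state, read on the work-queue thread."""
        done = threading.Event()
        result = {}

        def read():
            result.update({n: set(jobs) for n, jobs in self._active_jobs.items()})
            done.set()

        self._work_queue.schedule(read)
        done.wait()
        return result
```

Because the queue is FIFO and has a single consumer, a `snapshot()` scheduled after an update from the same caller always sees that update. The trade-off in this sketch is that reads block until the worker reaches them.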