Details
- Bug
- Resolution: Duplicate
- Test Blocker
- 6.0.0
- centos cluster
- Untriaged
- Centos 64-bit
- Unknown
Description
Build: 6.0.0-1480
Rebalance failed when removing an analytics node.
[2018-08-10T00:41:36-07:00, sequoiatools/couchbase-cli:facbde] rebalance -c 172.23.104.16:8091 --server-remove 172.23.104.23 -u Administrator -p password
Debug log
[user:error,2018-08-10T00:41:52.263-07:00,ns_1@172.23.104.16:<0.6509.0>:ns_orchestrator:do_log_rebalance_completion:1117]Rebalance exited with reason {service_rebalance_failed,cbas,
{rebalance_failed,
{service_error,
<<"Rebalance 8e784bec9eff91cbfc5a5e51a50a734c failed: CBAS0001: Datasets in different partitions have different DCP states. Mutations needed to catch up = 35639. User action: Try again later">>}}}
Error logs
2018-08-10T00:41:48.140-07:00 ERRO CBAS.executor.JobExecutor [Worker:ClusterController] Unexpected failure. Aborting job JID:0.7203
org.apache.hyracks.api.exceptions.HyracksException: HYR0010: Node 3223dd365e3fc1d01376ed0269e54dc4 does not exist
at org.apache.hyracks.api.exceptions.HyracksException.create(HyracksException.java:56) ~[hyracks-api.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.assignLocation(JobExecutor.java:473) ~[hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.assignTaskLocations(JobExecutor.java:365) ~[hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.startRunnableTaskClusters(JobExecutor.java:245) ~[hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.startRunnableActivityClusters(JobExecutor.java:209) ~[hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.notifyNodeFailures(JobExecutor.java:732) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.cluster.NodeManager.failNode(NodeManager.java:197) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.cluster.NodeManager.addNode(NodeManager.java:110) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.work.RegisterNodeWork.doRun(RegisterNodeWork.java:58) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.common.work.SynchronizableWork.run(SynchronizableWork.java:43) [hyracks-control-common.jar:6.0.0-1480]
at org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127) [hyracks-control-common.jar:6.0.0-1480]
2018-08-10T00:41:48.141-07:00 ERRO CBAS.executor.JobExecutor [Worker:ClusterController] Unexpected failure. Aborting job JID:0.7204
org.apache.hyracks.api.exceptions.HyracksException: HYR0010: Node 3223dd365e3fc1d01376ed0269e54dc4 does not exist
at org.apache.hyracks.api.exceptions.HyracksException.create(HyracksException.java:56) ~[hyracks-api.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.startRunnableTaskClusters(JobExecutor.java:245) ~[hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.startRunnableActivityClusters(JobExecutor.java:209) ~[hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.executor.JobExecutor.notifyNodeFailures(JobExecutor.java:732) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.cluster.NodeManager.failNode(NodeManager.java:197) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.cluster.NodeManager.addNode(NodeManager.java:110) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.cc.work.RegisterNodeWork.doRun(RegisterNodeWork.java:58) [hyracks-control-cc.jar:6.0.0-1480]
at org.apache.hyracks.control.common.work.SynchronizableWork.run(SynchronizableWork.java:43) [hyracks-control-common.jar:6.0.0-1480]
at org.apache.hyracks.control.common.work.WorkQueue$WorkerThread.run(WorkQueue.java:127) [hyracks-control-common.jar:6.0.0-1480]
2018-08-10T00:41:50.200-07:00 ERRO CBAS.active.ActiveEntityEventsListener [ActiveNotificationHandler] Active Job JID:0.7074 failed
org.apache.hyracks.api.exceptions.HyracksDataException: HYR0115: Local network error
at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:60) ~[hyracks-api.jar:6.0.0-1480]
at org.apache.hyracks.dataflow.std.collectors.NonDeterministicChannelReader.findNextSender(NonDeterministicChannelReader.java:115) ~[hyracks-dataflow-std.jar:6.0.0-1480]
at org.apache.hyracks.dataflow.std.collectors.NonDeterministicFrameReader.nextFrame(NonDeterministicFrameReader.java:43) ~[hyracks-dataflow-std.jar:6.0.0-1480]
at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:391) ~[hyracks-control-nc.jar:6.0.0-1480]
at org.apache.hyracks.control.nc.Task.run(Task.java:330) ~[hyracks-control-nc.jar:6.0.0-1480]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2018-08-10T00:41:51.924-07:00 ERRO CBAS.rebalance.Rebalance [Executor-99:ClusterController] rebalance failed
com.couchbase.analytics.common.exceptions.AnalyticsHyracksException: CBAS0001: Datasets in different partitions have different DCP states. Mutations needed to catch up = 35639. User action: Try again later
at com.couchbase.analytics.control.rebalance.ShadowStateWriteCallback.beforeRebalance(ShadowStateWriteCallback.java:80) ~[cbas-server.jar:6.0.0-1480]
at org.apache.asterix.utils.RebalanceUtil.rebalance(RebalanceUtil.java:220) ~[asterix-app.jar:6.0.0-1480]
at org.apache.asterix.utils.RebalanceUtil.rebalance(RebalanceUtil.java:131) ~[asterix-app.jar:6.0.0-1480]
at com.couchbase.analytics.control.rebalance.Rebalance.rebalanceDataset(Rebalance.java:426) ~[cbas-server.jar:6.0.0-1480]
at com.couchbase.analytics.control.rebalance.Rebalance.rebalanceDatasets(Rebalance.java:251) ~[cbas-server.jar:6.0.0-1480]
at com.couchbase.analytics.control.rebalance.Rebalance.lambda$doRebalance$3(Rebalance.java:183) ~[cbas-server.jar:6.0.0-1480]
at org.apache.hyracks.api.util.InvokeUtil.tryWithCleanups(InvokeUtil.java:191) ~[hyracks-api.jar:6.0.0-1480]
at com.couchbase.analytics.control.rebalance.Rebalance.doRebalance(Rebalance.java:179) ~[cbas-server.jar:6.0.0-1480]
at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:139) [cbas-server.jar:6.0.0-1480]
at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:74) [cbas-server.jar:6.0.0-1480]
at com.couchbase.analytics.runtime.WriteLockCallable.call(WriteLockCallable.java:21) [cbas-connector.jar:6.0.0-1480]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2018-08-10T00:41:52.237-07:00 ERRO CBAS.servlet.RebalanceServlet [HttpExecutor(port:9111)-1] Rebalance 8e784bec9eff91cbfc5a5e51a50a734c failed
Subsequent rebalances also failed with the same error.
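When comparing repeated failures like the ones above, it can help to pull the error code and the reported mutation backlog out of each message. The following is a minimal sketch only: `summarize` and both regular expressions are hypothetical helpers written for this report, not part of any Couchbase tooling, and they assume the message wording shown in the logs above.

```python
import re
from typing import Optional, Tuple

# Assumption: error codes look like CBAS0001 or HYR0010, as in the logs above.
CODE_RE = re.compile(r"\b(CBAS\d{4}|HYR\d{4})\b")
# Assumption: the backlog is reported with exactly this wording.
BACKLOG_RE = re.compile(r"Mutations needed to catch up = (\d+)")


def summarize(message: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (first error code, mutation backlog); either may be None."""
    code = CODE_RE.search(message)
    backlog = BACKLOG_RE.search(message)
    return (
        code.group(1) if code else None,
        int(backlog.group(1)) if backlog else None,
    )


if __name__ == "__main__":
    msg = (
        "Rebalance 8e784bec9eff91cbfc5a5e51a50a734c failed: CBAS0001: "
        "Datasets in different partitions have different DCP states. "
        "Mutations needed to catch up = 35639. User action: Try again later"
    )
    print(summarize(msg))
```

Running the helper over the message from each rebalance attempt would show whether the backlog shrinks between retries or stays constant.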
Issue Links
- relates to MB-30808 [System Test] Analytics Rebalance failure : HYR0003: HYR0114: Node is not active (Closed)