Details
- Bug
- Resolution: Fixed
- Critical
- 6.0.0
- centos cluster (longevity)
- Untriaged
- Unknown
- CX Sprint 114
Description
Build : 6.0.0-1432
Test : -test tests/integration/test_allFeatures_alice.yml -scope tests/integration/scope_Xattrs_Alice.yml
Scale : 3
This issue has been seen several times during system test runs on the latest build. The most recent occurrence is rebalance ID d958ccd453cca075ef38f72b4c8915ea.
Analytics rebalance failures also cause rebalance operations involving other services to fail. Like GSI, the Analytics service should refrain from rebalancing Analytics nodes when the rebalance operation is initiated for nodes of other services.
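To make the suggestion above concrete, here is a minimal sketch of such a guard. It assumes the service can compare its current set of Analytics nodes with the set requested by the rebalance; the class name, method name, and node-ID strings are hypothetical and are not taken from the Couchbase Analytics code base.

```java
import java.util.Set;

/** Hypothetical helper for illustration only; not part of cbas-server. */
final class AnalyticsRebalanceGate {
    private AnalyticsRebalanceGate() {
    }

    /**
     * Returns true only when the rebalance changes the set of Analytics nodes.
     * A rebalance that adds or removes nodes of other services leaves the set
     * unchanged, so under this assumption the datasets would not be rebalanced.
     */
    static boolean requiresAnalyticsRebalance(Set<String> currentAnalyticsNodes,
                                              Set<String> targetAnalyticsNodes) {
        return !currentAnalyticsNodes.equals(targetAnalyticsNodes);
    }
}
```

With a guard of this kind, a rebalance that only touches index or data nodes would return early on the Analytics side instead of running the dataset rebalance path that failed here.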
Also, when disconnecting the link, it would be good to ensure that the DCP states on all partitions are balanced, even if that delays the disconnect operation, so that issues like this one can be avoided.
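A rough sketch of what waiting for balanced DCP states before a disconnect might look like is given below. It rests on several assumptions that are not confirmed by this report: each partition's DCP state is modelled as an array of per-vBucket sequence numbers, the lag is the sum of each partition's distance from the most advanced partition, and the snapshot supplier stands in for however the real service reads those states. All names are hypothetical.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of cbas-server. */
final class DcpStateBalance {
    private DcpStateBalance() {
    }

    /**
     * Sums, over all vBuckets, how far each partition lags behind the most
     * advanced partition. Zero means every partition holds the same state.
     * Assumes every array has the same length (one entry per vBucket).
     */
    static long mutationsToCatchUp(List<long[]> seqnosByPartition) {
        if (seqnosByPartition.isEmpty()) {
            return 0;
        }
        int vbuckets = seqnosByPartition.get(0).length;
        long lag = 0;
        for (int vb = 0; vb < vbuckets; vb++) {
            long max = Long.MIN_VALUE;
            for (long[] partition : seqnosByPartition) {
                max = Math.max(max, partition[vb]);
            }
            for (long[] partition : seqnosByPartition) {
                lag += max - partition[vb];
            }
        }
        return lag;
    }

    /**
     * Polls the supplied snapshot until the partitions agree or the timeout
     * elapses. Returns true if the states became balanced, false otherwise.
     */
    static boolean awaitBalanced(Supplier<List<long[]>> snapshot,
                                 Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (mutationsToCatchUp(snapshot.get()) == 0) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A disconnect path could call `awaitBalanced` first and proceed only when it returns true, trading a slower disconnect for a state that a later rebalance can accept.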
The following errors appear in the analytics_error.log file on 172.23.96.145:
2018-08-05T12:45:56.027-07:00 ERRO CBAS.metadata.BucketEventsListener [Executor-571:ClusterController] Failed to connect bucket Default.Local.CUSTOMER(CouchbaseMetadataExtension)
java.lang.NullPointerException: null
2018-08-05T12:46:24.561-07:00 ERRO CBAS.metadata.BucketEventsListener [Executor-657:ClusterController] Failed to connect bucket Default.Local.CUSTOMER(CouchbaseMetadataExtension)
java.lang.NullPointerException: null
2018-08-05T12:47:09.721-07:00 ERRO CBAS.rebalance.Rebalance [Executor-586:ClusterController] rebalance failed
com.couchbase.analytics.common.exceptions.AnalyticsHyracksException: CBAS0001: Datasets in different partitions have different DCP states. Mutations needed to catch up = 234581. User action: Connect the bucket: { "class" : "Bucket", "dataverse" : "Default", "link" : "Local", "bucket" : "default", "uuid" : "0e91fbf6d20c5b4a6456222cc2c45ab4", "running" : false } or drop the dataset: Default.ds1
    at com.couchbase.analytics.control.rebalance.ShadowStateWriteCallback.beforeRebalance(ShadowStateWriteCallback.java:89) ~[cbas-server.jar:6.0.0-1435]
    at org.apache.asterix.utils.RebalanceUtil.rebalance(RebalanceUtil.java:220) ~[asterix-app.jar:6.0.0-1435]
    at org.apache.asterix.utils.RebalanceUtil.rebalance(RebalanceUtil.java:131) ~[asterix-app.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.rebalanceDataset(Rebalance.java:403) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.rebalanceDatasets(Rebalance.java:237) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.lambda$doRebalance$3(Rebalance.java:170) ~[cbas-server.jar:6.0.0-1435]
    at org.apache.hyracks.api.util.InvokeUtil.tryWithCleanups(InvokeUtil.java:191) ~[hyracks-api.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.doRebalance(Rebalance.java:166) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:130) [cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:70) [cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.runtime.WriteLockCallable.call(WriteLockCallable.java:21) [cbas-connector.jar:6.0.0-1435]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2018-08-05T12:47:10.426-07:00 ERRO CBAS.servlet.RebalanceServlet [HttpExecutor(port:9111)-2] Rebalance d958ccd453cca075ef38f72b4c8915ea failed
com.couchbase.analytics.common.exceptions.AnalyticsHyracksException: CBAS0001: Datasets in different partitions have different DCP states. Mutations needed to catch up = 234581. User action: Connect the bucket: { "class" : "Bucket", "dataverse" : "Default", "link" : "Local", "bucket" : "default", "uuid" : "0e91fbf6d20c5b4a6456222cc2c45ab4", "running" : false } or drop the dataset: Default.ds1
    at com.couchbase.analytics.control.rebalance.ShadowStateWriteCallback.beforeRebalance(ShadowStateWriteCallback.java:89) ~[cbas-server.jar:6.0.0-1435]
    at org.apache.asterix.utils.RebalanceUtil.rebalance(RebalanceUtil.java:220) ~[asterix-app.jar:6.0.0-1435]
    at org.apache.asterix.utils.RebalanceUtil.rebalance(RebalanceUtil.java:131) ~[asterix-app.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.rebalanceDataset(Rebalance.java:403) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.rebalanceDatasets(Rebalance.java:237) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.lambda$doRebalance$3(Rebalance.java:170) ~[cbas-server.jar:6.0.0-1435]
    at org.apache.hyracks.api.util.InvokeUtil.tryWithCleanups(InvokeUtil.java:191) ~[hyracks-api.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.doRebalance(Rebalance.java:166) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:130) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.control.rebalance.Rebalance.doCall(Rebalance.java:70) ~[cbas-server.jar:6.0.0-1435]
    at com.couchbase.analytics.runtime.WriteLockCallable.call(WriteLockCallable.java:21) ~[cbas-connector.jar:6.0.0-1435]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2018-08-05T12:47:28.610-07:00 ERRO CBAS.metadata.BucketEventsListener [Executor-345:ClusterController] Failed to connect bucket Default.Local.CUSTOMER(CouchbaseMetadataExtension)
java.lang.NullPointerException: null
2018-08-05T12:48:18.334-07:00 ERRO CBAS.metadata.BucketEventsListener [Executor-659:ClusterController] Failed to connect bucket Default.Local.CUSTOMER(CouchbaseMetadataExtension)
java.lang.NullPointerException: null
Issue Links
- is duplicated by
  - MB-30767 [System Test] Rebalance operation for index service got hung because of analytics node rebalance (Closed)
- relates to
  - MB-50690 Rebalance is reported as completed successfully even when analytics reports unsuccessful rebalance (Closed)
  - MB-30767 [System Test] Rebalance operation for index service got hung because of analytics node rebalance (Closed)
  - MB-41910 [CX] Rebalance reports success even though it failed (Closed)