Details
- Bug
- Resolution: Not a Bug
- Major
- Columnar 1.0.0
- 1.0.0-2209
- Untriaged
- 0
- Unknown
- Analytics Sprint 46
Description
It looks like the cluster becomes unusable after a scale-up operation (from 16 to 32 nodes). The sequence of events:
Rebalance from 16 to 32 nodes is triggered at
2024-07-15T16:45:11.318
This completes at
2024-07-15T17:01:24.681Z
Some exceptions are seen around 17:35 (unsure whether they are relevant):
2024-07-15T17:35:53.442+00:00 WARN CBAS.dataflow.FeedRecordDataFlowController [SAO:JID:0.4459:TAID:TID:ANID:ODID:170:0:384:0:(linkZcZKRcJl/default1)[384]:BO] data flow controller interrupted
java.lang.InterruptedException: null
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1640) ~[?:?]
at java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435) ~[?:?]
at com.couchbase.analytics.adapter.CouchbaseConnector.pollNextMessage(CouchbaseConnector.java:844) ~[columnar-connector.jar:1.0.0-2209]
at com.couchbase.analytics.adapter.CouchbaseConnector.next(CouchbaseConnector.java:810) ~[columnar-connector.jar:1.0.0-2209]
at org.apache.asterix.external.dataflow.FeedRecordDataFlowController.next(FeedRecordDataFlowController.java:139) ~[asterix-external-data.jar:1.0.0-2209]
at org.apache.asterix.external.dataflow.FeedRecordDataFlowController.start(FeedRecordDataFlowController.java:88) ~[asterix-external-data.jar:1.0.0-2209]
at org.apache.asterix.external.dataset.adapter.FeedAdapter.start(FeedAdapter.java:41) ~[asterix-external-data.jar:1.0.0-2209]
at org.apache.asterix.common.external.IDataSourceAdapter.start(IDataSourceAdapter.java:75) ~[asterix-common.jar:1.0.0-2209]
at com.couchbase.analytics.runtime.BucketOperatorNodePushable.start(BucketOperatorNodePushable.java:50) ~[columnar-connector.jar:1.0.0-2209]
at org.apache.asterix.active.ActiveSourceOperatorNodePushable.initialize(ActiveSourceOperatorNodePushable.java:101) ~[asterix-active.jar:1.0.0-2209]
at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:233) ~[hyracks-api.jar:1.0.0-2209]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
Then we start seeing "cluster is UNUSABLE" messages:
2024-07-15T17:37:24.406+00:00 WARN CBAS.server.QueryServiceServlet [HttpExecutor(port:18095)-10] handleException: ASX0032: Cannot execute request, cluster is UNUSABLE: uuid=null, clientContextID=96f8c7c8-a113-40f1-bce9-ca5867be508f
2024-07-15T17:37:56.723+00:00 INFO CBAS.server.QueryServiceServlet [HttpExecutor(port:18095)-11] handleRequest: uuid=9cdc2d98-dae2-440f-b7ae-4cd9010d8fbd, clientContextID=null,
2024-07-15T17:42:54.438+00:00 INFO CBAS.messaging.NCMessageBroker [Worker:9d1bf4c6302db62e3f570c2df2678cd9] Received message: ExecuteStatementResponseMessage(id=397, uuid=f20ff46c-1bb1-4eac-8925-8b86e382afb9, clientContextId=null): 0 characters
2024-07-15T17:42:54.439+00:00 WARN CBAS.server.QueryServiceServlet [HttpExecutor(port:18095)-8] handleException: ASX0032: Cannot execute request, cluster is UNUSABLE: uuid=null, clientContextID=f20ff46c-1bb1-4eac-8925-8b86e382afb9
2024-07-15T17:43:20.183+00:00 INFO CBAS.cbas updating
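To make the timeline easier to reason about, here is a minimal sketch that finds the first ASX0032 warning in a set of log lines and measures how long after the rebalance it appeared. The regular expression is inferred only from the lines quoted in this ticket, the helper names (`first_unusable`, `seconds_between`) are hypothetical, and the rebalance start time is assumed to be UTC because the reported value carries no zone suffix.

```python
import re
from datetime import datetime, timezone

# Leading fields of the log lines quoted in this ticket:
# <ISO-8601 timestamp> <LEVEL> <logger> <rest of message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+(?P<rest>.*)$"
)


def first_unusable(lines):
    """Return the timestamp of the first WARN line mentioning ASX0032, or None."""
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") == "WARN" and "ASX0032" in m.group("rest"):
            return datetime.fromisoformat(m.group("ts"))
    return None


def seconds_between(start, end):
    """Elapsed seconds between two timezone-aware datetimes."""
    return (end - start).total_seconds()


# Rebalance timestamps as reported in the description.
# Assumption: the start time is UTC (the report gives no zone suffix for it).
REBALANCE_START = datetime(2024, 7, 15, 16, 45, 11, 318000, tzinfo=timezone.utc)
REBALANCE_END = datetime(2024, 7, 15, 17, 1, 24, 681000, tzinfo=timezone.utc)

# Two lines copied from the log excerpt in this ticket.
SAMPLE = [
    "2024-07-15T17:37:24.406+00:00 WARN CBAS.server.QueryServiceServlet "
    "[HttpExecutor(port:18095)-10] handleException: ASX0032: Cannot execute "
    "request, cluster is UNUSABLE: uuid=null, "
    "clientContextID=96f8c7c8-a113-40f1-bce9-ca5867be508f",
    "2024-07-15T17:43:20.183+00:00 INFO CBAS.cbas updating",
]

if __name__ == "__main__":
    first = first_unusable(SAMPLE)
    print("rebalance duration (s):", seconds_between(REBALANCE_START, REBALANCE_END))
    print("completion to first UNUSABLE (s):", seconds_between(REBALANCE_END, first))
```

With the timestamps in this ticket, the rebalance itself took about 16 minutes, and the first UNUSABLE warning appeared about 36 minutes after it completed, so the failure does not coincide with the rebalance window itself.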
This looks different from https://issues.couchbase.com/browse/MB-62680 because this was a rebalance-in. If the root cause is the same, please close this out as a duplicate.
cbcollect ->
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-001.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-002.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-003.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-004.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-005.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-006.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-007.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-008.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-009.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-010.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-011.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-012.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-013.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-014.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-015.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-016.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-017.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-018.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-019.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-020.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-021.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-022.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-023.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-024.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-025.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-026.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-027.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-028.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-029.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-030.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-031.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar15July/collectinfo-2024-07-15T182549-ns_1%40svc-da-node-032.twi3gef5x8hk6evi.sandbox.nonprod-project-avengers.com.zip
Issue Links
- is caused by: MB-62740 [System Test] Terminating due to java.lang.OutOfMemoryError: Java heap space error seen (Closed)