Details
-
Bug
-
Resolution: Duplicate
-
Major
-
Columnar 1.0.0
-
1.0.0-2187 4-node cluster (64 GB + 16 vcpus)
-
Untriaged
-
0
-
Unknown
-
Analytics Sprint 45
Description
There appear to be two different types of IllegalStateException:
Occurrence 1 (appears to have come from Kafka links):
On node 003
2024-06-30T12:54:39.202+00:00 INFO CBAS.runtime.TopicOffsetUpdateCallback [Executor-2004:472f10b654ebabf98903a78f556bce05] triggering flush if needed on {"dir" : "/var/cb-cache/@analytics/v_iodevice_8/storage/partition_104/Default/Default/LinkedDatasetEdTuSczvQQ/0/LinkedDatasetEdTuSczvQQ", "memory" : [{"state":"READABLE_WRITABLE", "writers":0, "readers":0, "pendingFlushes":0, "id":"[27,27]", "index":{"class": "BTree", "file": "storage/partition_104/Default/Default/LinkedDatasetEdTuSczvQQ/0/LinkedDatasetEdTuSczvQQ_virtual_0"}}, {"state":"INACTIVE", "writers":0, "readers":0, "pendingFlushes":0, "id":"null", "index":{"class": "BTree", "file": "storage/partition_104/Default/Default/LinkedDatasetEdTuSczvQQ/0/LinkedDatasetEdTuSczvQQ_virtual_1"}}], "disk" : 3, "num-scheduled-flushes":0, "current-memory-component":0} to persist the topic state. reason stopping ingestion |
2024-06-30T12:54:39.203+00:00 WARN CBAS.util.CleanupUtils [SAO:JID:0.10846:TAID:TID:ANID:ODID:15:0:16:0] Failure destroying a destroyable resource |
java.lang.IllegalStateException: Cannot destroy a cursor in the state OPENED |
at org.apache.hyracks.storage.common.EnforcedIndexCursor.destroy(EnforcedIndexCursor.java:93) ~[hyracks-storage-common.jar:1.0.0-2187] |
at org.apache.hyracks.storage.am.lsm.common.impls.LSMIndexSearchCursor.doDestroy(LSMIndexSearchCursor.java:163) ~[hyracks-storage-am-lsm-common.jar:1.0.0-2187] |
at org.apache.hyracks.storage.common.EnforcedIndexCursor.destroy(EnforcedIndexCursor.java:98) ~[hyracks-storage-common.jar:1.0.0-2187] |
at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeSearchCursor.doDestroy(LSMBTreeSearchCursor.java:81) ~[hyracks-storage-am-lsm-btree.jar:1.0.0-2187] |
at org.apache.hyracks.storage.common.EnforcedIndexCursor.destroy(EnforcedIndexCursor.java:98) ~[hyracks-storage-common.jar:1.0.0-2187] |
at org.apache.hyracks.api.util.CleanupUtils.destroy(CleanupUtils.java:38) ~[hyracks-api.jar:1.0.0-2187] |
at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.releaseResources(IndexSearchOperatorNodePushable.java:371) ~[hyracks-storage-am-common.jar:1.0.0-2187] |
at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.close(IndexSearchOperatorNodePushable.java:331) ~[hyracks-storage-am-common.jar:1.0.0-2187] |
at org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.close(EmptyTupleSourceRuntimeFactory.java:61) ~[algebricks-runtime.jar:1.0.0-2187] |
at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$SourcePushRuntime.initialize(AlgebricksMetaOperatorDescriptor.java:181) ~[algebricks-runtime.jar:1.0.0-2187] |
at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:233) ~[hyracks-api.jar:1.0.0-2187] |
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] |
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?] |
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?] |
at java.base/java.lang.Thread.run(Thread.java:840) [?:?] |
2024-06-30T12:54:39.203+00:00 WARN CBAS.runtime.TopicOperatorNodePushable [SAO:JID:0.10836:TAID:TID:ANID:ODID:80:0:9:0:(Default.KafkaLinkJSiNmbDZkD.b-3-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-2-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196,b-1-public.qekafkatestcluster.7b9vtv.c13.kafka.us-east-1.amazonaws.com:9196(CB))[9]:TO] ingestion failed |
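In the trace above, the exception is thrown by the innermost EnforcedIndexCursor.destroy call, which is reached through two nested doDestroy calls. The message suggests that destroy is only allowed once a cursor has been closed. The following is a minimal Python model of that kind of lifecycle check, written only to illustrate the failure mode. The class names, the set of states, and the nesting are assumptions for illustration and are not taken from the Hyracks source.

```python
from enum import Enum


class State(Enum):
    CLOSED = "CLOSED"
    OPENED = "OPENED"
    DESTROYED = "DESTROYED"


class IllegalStateError(RuntimeError):
    """Stand-in for java.lang.IllegalStateException (hypothetical name)."""


class EnforcedCursor:
    """Hypothetical cursor that enforces open -> close -> destroy ordering."""

    def __init__(self, inner=None):
        self.state = State.CLOSED
        self.inner = inner  # optional wrapped cursor, destroyed with this one

    def open(self):
        if self.state is not State.CLOSED:
            raise IllegalStateError(
                f"Cannot open a cursor in the state {self.state.name}"
            )
        self.state = State.OPENED

    def close(self):
        # Closing is only meaningful for an opened cursor; otherwise a no-op.
        if self.state is State.OPENED:
            self.state = State.CLOSED

    def destroy(self):
        if self.state is State.DESTROYED:
            return  # destroying twice is tolerated in this model
        if self.state is not State.CLOSED:
            raise IllegalStateError(
                f"Cannot destroy a cursor in the state {self.state.name}"
            )
        self.state = State.DESTROYED
        if self.inner is not None:
            # Mirrors the nested doDestroy -> destroy chain in the trace.
            self.inner.destroy()
```

In this model, wrapping an inner cursor that is still opened and calling destroy on the outer cursor raises the same message from the inner cursor, even though the outer cursor itself was closed. That is one possible reading of the trace; it would need to be confirmed against the actual cleanup path during the ingestion stop.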
Occurrence 2 (from remote links):
2024-06-30T14:59:50.157+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-191:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xd is already registered"} |
2024-06-30T14:59:50.158+00:00 INFO CBAS.messaging.NCMessageBroker [Worker:472f10b654ebabf98903a78f556bce05] Received message: ShadowStatesRequest{(linkoRAAeTpT/default1)[-1]:StateRequest-0xd} |
2024-06-30T14:59:50.159+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-192:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xd is already registered"} |
2024-06-30T14:59:50.160+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-193:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xc is already registered"} |
2024-06-30T14:59:50.161+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-194:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xc is already registered"} |
2024-06-30T14:59:50.162+00:00 INFO CBAS.messaging.NCMessageBroker [Worker:472f10b654ebabf98903a78f556bce05] Received message: ShadowStatesRequest{(linkoRAAeTpT/default1)[-1]:StateRequest-0xc} |
2024-06-30T14:59:50.162+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-191:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xd is already registered"} |
2024-06-30T14:59:50.165+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-194:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xc is already registered"} |
2024-06-30T14:59:50.168+00:00 INFO CBAS.messaging.NCMessageBroker [Worker:472f10b654ebabf98903a78f556bce05] Received message: ShadowStatesRequest{(linkgLNAoMEM/default1)[-1]:StateRequest-0xc} |
2024-06-30T14:59:50.172+00:00 INFO CBAS.messaging.NCMessageBroker [Worker:472f10b654ebabf98903a78f556bce05] Received message: ShadowStatesRequest{(linkgLNAoMEM/default1)[-1]:StateRequest-0xd} |
2024-06-30T14:59:50.173+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-194:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xd is already registered"} |
2024-06-30T14:59:50.175+00:00 INFO CBAS.messaging.CCMessageBroker [Executor-194:ClusterController] Received message: {"class":"ShadowStatesResponse","failure":"java.lang.IllegalStateException: Active Runtime (linkoRAAeTpT/default1)[-1]:StateRequest-0xc is already registered"} |
These exceptions were seen on other nodes as well (001, 006, 007, and 008, for example).
The test was in its quiet period, so I am not sure whether this had any functional impact; I am therefore marking it Major. Please raise or lower the priority as needed.
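To check how widespread each variant is across the node logs, something like the following sketch could tally IllegalStateException messages by kind. The category names and the helper functions are hypothetical, and the matching rules are based only on the two message shapes quoted in this ticket.

```python
import re
from collections import Counter

# Captures the message after the exception class, stopping at a quote,
# a pipe, or the end of the line.
ISE_PATTERN = re.compile(r'java\.lang\.IllegalStateException: (?P<msg>[^"|\n]+)')


def classify(message):
    """Map an exception message to one of the two variants in this ticket."""
    message = message.strip()
    if message.startswith("Cannot destroy a cursor"):
        return "cursor-destroy"
    if message.endswith("is already registered"):
        return "runtime-already-registered"
    return "other"


def count_illegal_state(lines):
    """Count IllegalStateException occurrences per variant in log lines."""
    counts = Counter()
    for line in lines:
        match = ISE_PATTERN.search(line)
        if match:
            counts[classify(match.group("msg"))] += 1
    return counts
```

Passing the lines of each node's analytics log to count_illegal_state gives a per-node count for each variant, which makes it easier to see whether both occurrences appear on the same nodes.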
cbcollect logs:
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-001.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-002.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-003.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-004.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-005.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-006.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-007.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-008.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-009.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-010.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-011.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-012.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-013.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-014.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-015.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnarJun28/collectinfo-2024-06-30T181121-ns_1%40svc-da-node-016.bks3edqzezfgtl1s.sandbox.nonprod-project-avengers.com.zip