Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
Columnar 1.0.0
-
1.0.0-2045
-
Untriaged
-
0
-
Unknown
-
Analytics Sprint 43, Analytics Sprint 44
Description
During the system test run (it's a 2 node cluster), a query has failed with Internal server exception -
Query 1 (context ID 4206c5-2ba4-644a-170a-0c13840b5e812b) -
couchbase.exceptions.InternalServerFailureException: InternalServerFailureException(<ec=5, category=couchbase.common, message=internal_server_failure (5), context=AnalyticsErrorContext({'last_dispatched_to': '3.80.39.211:18095', 'last_dispatched_from': '172.17.0.2:49330', 'retry_attempts': 0, 'client_context_id': '4206c5-2ba4-644a-170a-0c13840b5e812b', 'method': 'POST', 'path': '/query/service', 'http_status': 500, 'http_body': '{\n\t"requestID": "407d6a45-fa9e-4fb5-80de-ddc9ccd30635",\n\t"clientContextID": "4206c5-2ba4-644a-170a-0c13840b5e812b",\n\t"errors": [{ \n\t\t"code": 25000,\t\t"msg": "Internal error"\t} \n\t],\n\t"status": "fatal",\n\t"metrics": {\n\t\t"elapsedTime": "449.212062318s",\n\t\t"executionTime": "449.210852624s",\n\t\t"compileTime": "0ns",\n\t\t"queueWaitTime": "0ns",\n\t\t"resultCount": 0,\n\t\t"resultSize": 0,\n\t\t"processedObjects": 0,\n\t\t"bufferCacheHitRatio": "0.00%",\n\t\t"bufferCachePageReadCount": 0,\n\t\t"errorCount": 1\n\t}\n}\n', 'first_error_code': 25000, 'first_error_message': 'Internal error', 'statement': "SELECT s.seller_name, p.product_name, p.avg_rating, CASE WHEN p.avg_rating = s.max_rating THEN 'Highest' WHEN p.avg_rating = s.min_rating THEN 'Lowest' END AS rating_type FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB` p JOIN (SELECT seller_name, MAX(avg_rating) OVER (PARTITION BY seller_name) AS max_rating, MIN(avg_rating) OVER (PARTITION BY seller_name) AS min_rating FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB`) s ON p.seller_name = s.seller_name WHERE p.avg_rating IN (s.max_rating, s.min_rating) ORDER BY s.seller_name, rating_type;", 'parameters': '{"client_context_id":"4206c5-2ba4-644a-170a-0c13840b5e812b","statement":"SELECT s.seller_name, p.product_name, p.avg_rating, CASE WHEN p.avg_rating = s.max_rating THEN \'Highest\' WHEN p.avg_rating = s.min_rating THEN \'Lowest\' END AS rating_type FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB` p JOIN (SELECT seller_name, MAX(avg_rating) OVER (PARTITION BY seller_name) AS max_rating, MIN(avg_rating) OVER (PARTITION BY seller_name) AS min_rating FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB`) s ON p.seller_name = s.seller_name WHERE p.avg_rating IN (s.max_rating, s.min_rating) ORDER BY s.seller_name, rating_type;","timeout":"900000ms"}', 'context_type': 'AnalyticsErrorContext'}), C Source=/tmp/pip-install-r493hp2k/couchbase_c4c9402d38bc44e498c8ad0f8a9fb1c5/src/analytics.cxx:301>) |
From the logs on node da-001, I could see a few log lines related to exceptions -
2024-05-16T06:25:36.992+00:00 WARN CBAS.server.QueryServiceServlet [HttpExecutor(port:18095)-7] handleException: unexpected exception: {"host":"svc-da-node-001.v2asqzwdtqzp1lde.sandbox.nonprod-project-avengers.com:18095","path":"/query/service","statement":"<ud>SELECT s.seller_name, p.product_name, p.avg_rating, CASE WHEN p.avg_rating = s.max_rating THEN 'Highest' WHEN p.avg_rating = s.min_rating THEN 'Lowest' END AS rating_type FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB` p JOIN (SELECT seller_name, MAX(avg_rating) OVER (PARTITION BY seller_name) AS max_rating, MIN(avg_rating) OVER (PARTITION BY seller_name) AS min_rating FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB`) s ON p.seller_name = s.seller_name WHERE p.avg_rating IN (s.max_rating, s.min_rating) ORDER BY s.seller_name, rating_type;</ud>","pretty":false,"mode":"immediate","clientContextID":"4206c5-2ba4-644a-170a-0c13840b5e812b","clientType":"ASTERIX","dataverse":null,"format":"CLEAN_JSON","timeout":900000,"maxResultReads":1,"planFormat":"JSON","expressionTree":false,"rewrittenExpressionTree":false,"logicalPlan":false,"optimizedLogicalPlan":false,"job":false,"profile":"counts","signature":true,"multiStatement":false,"parseOnly":false,"readOnly":false,"maxWarnings":0,"sqlCompat":false,"source":null,"scanConsistency":null,"scanWait":null} |
org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.avro.AvroRuntimeException: java.net.SocketException: Connection reset
|
at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:70) ~[hyracks-api.jar:1.0.0-2045] |
at org.apache.hyracks.api.util.ExceptionUtils.setNodeIds(ExceptionUtils.java:70) ~[hyracks-api.jar:1.0.0-2045] |
at org.apache.hyracks.control.nc.Task.run(Task.java:398) ~[hyracks-control-nc.jar:1.0.0-2045] |
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?] |
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?] |
at java.base/java.lang.Thread.run(Thread.java:840) [?:?] |
Caused by: org.apache.avro.AvroRuntimeException: java.net.SocketException: Connection reset
|
at org.apache.avro.file.DataFileStream.hasNextBlock(DataFileStream.java:312) ~[avro-1.11.3.jar:1.11.3] |
at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:224) ~[avro-1.11.3.jar:1.11.3] |
at org.apache.asterix.external.input.record.reader.stream.AvroRecordReader.hasNext(AvroRecordReader.java:105) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:56) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.asterix.external.dataset.adapter.GenericAdapter.start(GenericAdapter.java:38) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.asterix.external.operators.ExternalScanOperatorDescriptor$1.initialize(ExternalScanOperatorDescriptor.java:80) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:233) ~[hyracks-api.jar:1.0.0-2045] |
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?] |
... 3 more |
Caused by: java.net.SocketException: Connection reset
|
at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:328) ~[?:?] |
at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:355) ~[?:?] |
at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:808) ~[?:?] |
at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966) ~[?:?] |
at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:484) ~[?:?] |
at java.base/sun.security.ssl.SSLSocketInputRecord.readFully(SSLSocketInputRecord.java:467) ~[?:?] |
at java.base/sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:243) ~[?:?] |
at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:181) ~[?:?] |
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?] |
at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1513) ~[?:?] |
at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1484) ~[?:?] |
at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1069) ~[?:?] |
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[httpcore-4.4.16.jar:4.4.16] |
at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:197) ~[httpcore-4.4.16.jar:4.4.16] |
at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176) ~[httpcore-4.4.16.jar:4.4.16] |
at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135) ~[httpclient-4.5.14.jar:4.5.14] |
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[?:?] |
at software.amazon.awssdk.services.s3.internal.checksums.S3ChecksumValidatingInputStream.read(S3ChecksumValidatingInputStream.java:112) ~[s3-2.24.9.jar:?] |
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[?:?] |
at software.amazon.awssdk.core.io.SdkFilterInputStream.read(SdkFilterInputStream.java:66) ~[sdk-core-2.24.9.jar:?] |
at software.amazon.awssdk.core.internal.metrics.BytesReadTrackingInputStream.read(BytesReadTrackingInputStream.java:49) ~[sdk-core-2.24.9.jar:?] |
at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[?:?] |
at software.amazon.awssdk.core.io.SdkFilterInputStream.read(SdkFilterInputStream.java:66) ~[sdk-core-2.24.9.jar:?] |
at org.apache.asterix.external.input.stream.DiscretizedMultipleInputStream$MultipleStreamWrapper.read(DiscretizedMultipleInputStream.java:90) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.asterix.external.input.stream.DiscretizedMultipleInputStream.read(DiscretizedMultipleInputStream.java:48) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.avro.io.BinaryDecoder$InputStreamByteSource.tryReadRaw(BinaryDecoder.java:867) ~[avro-1.11.3.jar:1.11.3] |
at org.apache.avro.io.BinaryDecoder.isEnd(BinaryDecoder.java:490) ~[avro-1.11.3.jar:1.11.3] |
at org.apache.avro.file.DataFileStream.hasNextBlock(DataFileStream.java:299) ~[avro-1.11.3.jar:1.11.3] |
at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:224) ~[avro-1.11.3.jar:1.11.3] |
at org.apache.asterix.external.input.record.reader.stream.AvroRecordReader.hasNext(AvroRecordReader.java:105) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:56) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.asterix.external.dataset.adapter.GenericAdapter.start(GenericAdapter.java:38) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.asterix.external.operators.ExternalScanOperatorDescriptor$1.initialize(ExternalScanOperatorDescriptor.java:80) ~[asterix-external-data.jar:1.0.0-2045] |
at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:233) ~[hyracks-api.jar:1.0.0-2045] |
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] |
... 3 more |
Logs ->
https://cb-engineering.s3.amazonaws.com/SysTestColumnar16May/collectinfo-2024-05-16T064205-ns_1%40svc-da-node-001.v2asqzwdtqzp1lde.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/SysTestColumnar16May/collectinfo-2024-05-16T064205-ns_1%40svc-da-node-002.v2asqzwdtqzp1lde.sandbox.nonprod-project-avengers.com.zip
Attachments
Issue Links
- is duplicated by
-
MB-62299 Merge operation failures, tcp io errors are seen while facing network issues between s3 and columnar.
- Closed
-
MB-62021 [System Test] Connection reset errors while running a query on an external dataset
- Closed
- relates to
-
MB-62144 Ensure exceptions are handled properly when reading Avro data format
- Closed