Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-61915

[System Test] InternalServerFailureException seen on an external Avro dataset

    XMLWordPrintable

Details

    • Untriaged
    • 0
    • Unknown
    • Analytics Sprint 43, Analytics Sprint 44

    Description

      During the system test run (it's a 2 node cluster), a query has failed with Internal server exception -

      Query 1 (context ID 4206c5-2ba4-644a-170a-0c13840b5e812b) -

      couchbase.exceptions.InternalServerFailureException: InternalServerFailureException(<ec=5, category=couchbase.common, message=internal_server_failure (5), context=AnalyticsErrorContext({'last_dispatched_to': '3.80.39.211:18095', 'last_dispatched_from': '172.17.0.2:49330', 'retry_attempts': 0, 'client_context_id': '4206c5-2ba4-644a-170a-0c13840b5e812b', 'method': 'POST', 'path': '/query/service', 'http_status': 500, 'http_body': '{\n\t"requestID": "407d6a45-fa9e-4fb5-80de-ddc9ccd30635",\n\t"clientContextID": "4206c5-2ba4-644a-170a-0c13840b5e812b",\n\t"errors": [{ \n\t\t"code": 25000,\t\t"msg": "Internal error"\t} \n\t],\n\t"status": "fatal",\n\t"metrics": {\n\t\t"elapsedTime": "449.212062318s",\n\t\t"executionTime": "449.210852624s",\n\t\t"compileTime": "0ns",\n\t\t"queueWaitTime": "0ns",\n\t\t"resultCount": 0,\n\t\t"resultSize": 0,\n\t\t"processedObjects": 0,\n\t\t"bufferCacheHitRatio": "0.00%",\n\t\t"bufferCachePageReadCount": 0,\n\t\t"errorCount": 1\n\t}\n}\n', 'first_error_code': 25000, 'first_error_message': 'Internal error', 'statement': "SELECT s.seller_name, p.product_name, p.avg_rating, CASE WHEN p.avg_rating = s.max_rating THEN 'Highest' WHEN p.avg_rating = s.min_rating THEN 'Lowest' END AS rating_type FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB` p JOIN (SELECT seller_name, MAX(avg_rating) OVER (PARTITION BY seller_name) AS max_rating, MIN(avg_rating) OVER (PARTITION BY seller_name) AS min_rating FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB`) s ON p.seller_name = s.seller_name WHERE p.avg_rating IN (s.max_rating, s.min_rating) ORDER BY s.seller_name, rating_type;", 'parameters': '{"client_context_id":"4206c5-2ba4-644a-170a-0c13840b5e812b","statement":"SELECT s.seller_name, p.product_name, p.avg_rating, CASE WHEN p.avg_rating = s.max_rating THEN \'Highest\' WHEN p.avg_rating = s.min_rating THEN \'Lowest\' END AS rating_type FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB` p JOIN (SELECT seller_name, MAX(avg_rating) OVER (PARTITION BY seller_name) AS max_rating, MIN(avg_rating) OVER (PARTITION BY seller_name) AS min_rating FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB`) s ON p.seller_name = s.seller_name WHERE p.avg_rating IN (s.max_rating, s.min_rating) ORDER BY s.seller_name, rating_type;","timeout":"900000ms"}', 'context_type': 'AnalyticsErrorContext'}), C Source=/tmp/pip-install-r493hp2k/couchbase_c4c9402d38bc44e498c8ad0f8a9fb1c5/src/analytics.cxx:301>)
      

      From the logs on node da-001, I could see a few log lines related to exceptions -

      2024-05-16T06:25:36.992+00:00 WARN CBAS.server.QueryServiceServlet [HttpExecutor(port:18095)-7] handleException: unexpected exception: {"host":"svc-da-node-001.v2asqzwdtqzp1lde.sandbox.nonprod-project-avengers.com:18095","path":"/query/service","statement":"<ud>SELECT s.seller_name, p.product_name, p.avg_rating, CASE WHEN p.avg_rating = s.max_rating THEN 'Highest' WHEN p.avg_rating = s.min_rating THEN 'Lowest' END AS rating_type FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB` p JOIN (SELECT seller_name, MAX(avg_rating) OVER (PARTITION BY seller_name) AS max_rating, MIN(avg_rating) OVER (PARTITION BY seller_name) AS min_rating FROM `Database16PbMXZ`.`scope0waFjd`.`avroExternalDatasetdHshIpQLzB`) s ON p.seller_name = s.seller_name WHERE p.avg_rating IN (s.max_rating, s.min_rating) ORDER BY s.seller_name, rating_type;</ud>","pretty":false,"mode":"immediate","clientContextID":"4206c5-2ba4-644a-170a-0c13840b5e812b","clientType":"ASTERIX","dataverse":null,"format":"CLEAN_JSON","timeout":900000,"maxResultReads":1,"planFormat":"JSON","expressionTree":false,"rewrittenExpressionTree":false,"logicalPlan":false,"optimizedLogicalPlan":false,"job":false,"profile":"counts","signature":true,"multiStatement":false,"parseOnly":false,"readOnly":false,"maxWarnings":0,"sqlCompat":false,"source":null,"scanConsistency":null,"scanWait":null}
      org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.avro.AvroRuntimeException: java.net.SocketException: Connection reset
      	at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:70) ~[hyracks-api.jar:1.0.0-2045]
      	at org.apache.hyracks.api.util.ExceptionUtils.setNodeIds(ExceptionUtils.java:70) ~[hyracks-api.jar:1.0.0-2045]
      	at org.apache.hyracks.control.nc.Task.run(Task.java:398) ~[hyracks-control-nc.jar:1.0.0-2045]
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
      	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
      Caused by: org.apache.avro.AvroRuntimeException: java.net.SocketException: Connection reset
      	at org.apache.avro.file.DataFileStream.hasNextBlock(DataFileStream.java:312) ~[avro-1.11.3.jar:1.11.3]
      	at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:224) ~[avro-1.11.3.jar:1.11.3]
      	at org.apache.asterix.external.input.record.reader.stream.AvroRecordReader.hasNext(AvroRecordReader.java:105) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:56) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.asterix.external.dataset.adapter.GenericAdapter.start(GenericAdapter.java:38) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.asterix.external.operators.ExternalScanOperatorDescriptor$1.initialize(ExternalScanOperatorDescriptor.java:80) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:233) ~[hyracks-api.jar:1.0.0-2045]
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
      	... 3 more
      Caused by: java.net.SocketException: Connection reset
      	at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:328) ~[?:?]
      	at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:355) ~[?:?]
      	at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:808) ~[?:?]
      	at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966) ~[?:?]
      	at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:484) ~[?:?]
      	at java.base/sun.security.ssl.SSLSocketInputRecord.readFully(SSLSocketInputRecord.java:467) ~[?:?]
      	at java.base/sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:243) ~[?:?]
      	at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:181) ~[?:?]
      	at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?]
      	at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1513) ~[?:?]
      	at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1484) ~[?:?]
      	at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1069) ~[?:?]
      	at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[httpcore-4.4.16.jar:4.4.16]
      	at org.apache.http.impl.io.SessionInputBufferImpl.read(SessionInputBufferImpl.java:197) ~[httpcore-4.4.16.jar:4.4.16]
      	at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:176) ~[httpcore-4.4.16.jar:4.4.16]
      	at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:135) ~[httpclient-4.5.14.jar:4.5.14]
      	at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[?:?]
      	at software.amazon.awssdk.services.s3.internal.checksums.S3ChecksumValidatingInputStream.read(S3ChecksumValidatingInputStream.java:112) ~[s3-2.24.9.jar:?]
      	at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[?:?]
      	at software.amazon.awssdk.core.io.SdkFilterInputStream.read(SdkFilterInputStream.java:66) ~[sdk-core-2.24.9.jar:?]
      	at software.amazon.awssdk.core.internal.metrics.BytesReadTrackingInputStream.read(BytesReadTrackingInputStream.java:49) ~[sdk-core-2.24.9.jar:?]
      	at java.base/java.io.FilterInputStream.read(FilterInputStream.java:132) ~[?:?]
      	at software.amazon.awssdk.core.io.SdkFilterInputStream.read(SdkFilterInputStream.java:66) ~[sdk-core-2.24.9.jar:?]
      	at org.apache.asterix.external.input.stream.DiscretizedMultipleInputStream$MultipleStreamWrapper.read(DiscretizedMultipleInputStream.java:90) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.asterix.external.input.stream.DiscretizedMultipleInputStream.read(DiscretizedMultipleInputStream.java:48) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.avro.io.BinaryDecoder$InputStreamByteSource.tryReadRaw(BinaryDecoder.java:867) ~[avro-1.11.3.jar:1.11.3]
      	at org.apache.avro.io.BinaryDecoder.isEnd(BinaryDecoder.java:490) ~[avro-1.11.3.jar:1.11.3]
      	at org.apache.avro.file.DataFileStream.hasNextBlock(DataFileStream.java:299) ~[avro-1.11.3.jar:1.11.3]
      	at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:224) ~[avro-1.11.3.jar:1.11.3]
      	at org.apache.asterix.external.input.record.reader.stream.AvroRecordReader.hasNext(AvroRecordReader.java:105) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:56) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.asterix.external.dataset.adapter.GenericAdapter.start(GenericAdapter.java:38) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.asterix.external.operators.ExternalScanOperatorDescriptor$1.initialize(ExternalScanOperatorDescriptor.java:80) ~[asterix-external-data.jar:1.0.0-2045]
      	at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:233) ~[hyracks-api.jar:1.0.0-2045]
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
      	... 3 more
      

      Logs ->

      https://cb-engineering.s3.amazonaws.com/SysTestColumnar16May/collectinfo-2024-05-16T064205-ns_1%40svc-da-node-001.v2asqzwdtqzp1lde.sandbox.nonprod-project-avengers.com.zip
      https://cb-engineering.s3.amazonaws.com/SysTestColumnar16May/collectinfo-2024-05-16T064205-ns_1%40svc-da-node-002.v2asqzwdtqzp1lde.sandbox.nonprod-project-avengers.com.zip

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              Hussain.Towaileb Hussain Towaileb
              pavan.pb Pavan PB
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty