Details
- Bug
- Resolution: Fixed
- Major
- 7.2.2
- Untriaged
- 0
- Unknown
- Analytics Sprint 31, Analytics Sprint 32
Description
An internal error occurs when the length() function is called on a string containing an invalid UTF-8 sequence. For example, the following query returns an internal error:
select string_length("xxxxxxxxxx x??\uDEAD");
The logs show the decoding error:
Caused by: java.lang.IllegalArgumentException: Decoding error: got a low surrogate without a leading high surrogate
    at org.apache.hyracks.util.string.UTF8StringUtil.codePointSize(UTF8StringUtil.java:127) ~[hyracks-util-7.2.2-6401.jar:7.2.2-6401]
    at org.apache.hyracks.util.string.UTF8StringUtil.getNumCodePoint(UTF8StringUtil.java:214) ~[hyracks-util-7.2.2-6401.jar:7.2.2-6401]
    at org.apache.asterix.runtime.evaluators.functions.StringLengthDescriptor$1$1.evaluate(StringLengthDescriptor.java:88) ~
For invalid UTF-8 strings, length() should return null instead of raising an error.
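The expected behavior can be illustrated with a minimal, self-contained sketch. This is not the actual fix in the Hyracks or AsterixDB code: the class `SafeStringLength` and the method `codePointLengthOrNull` are hypothetical names, and the sketch works on a Java `String` (UTF-16) rather than on the UTF-8 byte representation used by `UTF8StringUtil`. It only shows the idea of counting code points and returning null when an unpaired surrogate is found.

```java
public class SafeStringLength {

    /**
     * Hypothetical helper: counts the code points in s, or returns null
     * if s contains a surrogate that is not part of a valid pair.
     */
    public static Integer codePointLengthOrNull(String s) {
        int count = 0;
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed by a low surrogate.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return null;
                }
                i += 2; // a valid pair is one code point
            } else if (Character.isLowSurrogate(c)) {
                // Low surrogate without a leading high surrogate: invalid.
                return null;
            } else {
                i++;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The string from the failing query ends in a lone low surrogate.
        System.out.println(codePointLengthOrNull("xxxxxxxxxx x??\uDEAD")); // prints "null"
        System.out.println(codePointLengthOrNull("abc"));                  // prints "3"
    }
}
```

Returning a nullable `Integer` keeps the invalid-input case distinct from a length of zero, which mirrors the request that the function return null rather than fail the query.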