Fixed
Details
Assignee: Umang Agrawal (Deactivated)
Reporter: Ali Alsuliman
Is this a Regression?: Unknown
Triage: Untriaged
Story Points: 0
Sprint: None
Priority: Major
Created June 23, 2023 at 5:24 PM
Updated August 31, 2024 at 11:05 AM
Resolved June 24, 2023 at 2:14 AM
The ResultWriterOperatorDescriptor is the operator that persists the query result to disk. Each partition persists its portion of the result by serializing the tuples (which are in ADM format) as JSON strings into a byte array. This byte array is then added to a frame that is used to write the accumulated tuples to the result file. If the byte array is added to the frame successfully, the byte array is reset and the next tuple is serialized into it. However, if the byte array cannot be added because the frame is full, the frame is flushed to disk and emptied, and the byte array is reset at the same time. The tuple therefore has to be serialized into the byte array a second time before it can be added to the frame. This repeated serialization is expensive, especially for large tuples.
The byte array should not be reset when the frame is flushed because it cannot hold the byte array. Instead, the frame should be flushed without resetting the byte array, and adding the byte array to the frame should then be attempted again. The byte array should be reset only once it has been added successfully.
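The proposed control flow can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual ResultWriterOperatorDescriptor code: the `Frame` and `Writer` classes, their method names, the in-memory `disk` list, and the trivial string-quoting "serializer" are all hypothetical stand-ins. The sketch also assumes a tuple always fits in an empty frame and simply throws otherwise.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ResultWriterSketch {

    /** Hypothetical fixed-capacity frame; a stand-in for the real result frame. */
    static final class Frame {
        private final byte[] data;
        private int used;

        Frame(int capacity) {
            data = new byte[capacity];
        }

        /** Returns false and leaves the frame unchanged if the bytes do not fit. */
        boolean append(byte[] src) {
            if (src.length > data.length - used) {
                return false;
            }
            System.arraycopy(src, 0, data, used, src.length);
            used += src.length;
            return true;
        }

        /** Copies the frame contents to the sink and empties the frame. */
        void flush(List<byte[]> sink) {
            if (used > 0) {
                sink.add(Arrays.copyOf(data, used));
                used = 0;
            }
        }
    }

    /** Hypothetical writer showing the "flush, retry, then reset" order. */
    static final class Writer {
        private final Frame frame;
        private final ByteArrayOutputStream tupleBytes = new ByteArrayOutputStream();
        private final List<byte[]> disk = new ArrayList<>(); // stands in for the result file
        private int serializations;

        Writer(int frameCapacity) {
            frame = new Frame(frameCapacity);
        }

        void write(String tuple) {
            serialize(tuple); // done exactly once per tuple
            byte[] json = tupleBytes.toByteArray();
            if (!frame.append(json)) {
                // Frame is full: flush it, but keep the serialized bytes and retry.
                frame.flush(disk);
                if (!frame.append(json)) {
                    tupleBytes.reset();
                    throw new IllegalStateException("tuple does not fit in an empty frame");
                }
            }
            // Reset only after the bytes have been added successfully.
            tupleBytes.reset();
        }

        /** Placeholder serializer: wraps the tuple in quotes and counts the call. */
        private void serialize(String tuple) {
            byte[] json = ("\"" + tuple + "\"").getBytes(StandardCharsets.UTF_8);
            tupleBytes.write(json, 0, json.length);
            serializations++;
        }

        void close() {
            frame.flush(disk);
        }

        int serializations() {
            return serializations;
        }

        int flushedChunks() {
            return disk.size();
        }
    }

    public static void main(String[] args) {
        Writer writer = new Writer(16);
        for (int i = 0; i < 3; i++) {
            writer.write("aaaaaaaa");
        }
        writer.close();
        System.out.println("serializations: " + writer.serializations());
        System.out.println("flushed chunks: " + writer.flushedChunks());
    }
}
```

In this sketch the serialization counter grows by one per tuple even when a write triggers a flush, which is the behavior the description asks for. Under the old behavior, the flush path would have called the serializer a second time for the same tuple.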
Issue
Query results could be unnecessarily converted twice to JSON when documents were large.
Resolution
The query result is now converted to JSON once for all documents.