[BP 7.1.5] Result tuples unnecessarily serialized twice

Description

The ResultWriterOperatorDescriptor is the operator that persists the query result to disk. Each partition persists its portion of the result by serializing the tuples (which are in ADM format) as JSON strings into a byte array. This byte array is then added to a frame that accumulates tuples before they are written to the result file. If the byte array is added to the frame successfully, the byte array is reset and the next tuple is serialized into it. However, if the byte array cannot be added because the frame is full, the frame is flushed to disk and emptied, and the byte array is reset at the same time. As a result, the same tuple has to be serialized into the byte array a second time before it can be added to the frame. This is expensive, especially for large tuples.

The byte array should not be reset when the frame is flushed because it cannot hold the byte array. Instead, the frame should be flushed without resetting the byte array, and adding the byte array to the now-empty frame should be attempted again. The byte array should be reset only after it has been added successfully.
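
The proposed behavior can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual AsterixDB/Hyracks code: the Frame class, the writeAll method, and the use of plain strings in place of ADM tuples are all hypothetical stand-ins, and the sketch assumes that a single tuple always fits into an empty frame.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

class ResultWriterSketch {

    /** Hypothetical fixed-capacity frame that flushes its content to a sink. */
    static final class Frame {
        private final byte[] buffer;
        private final ByteArrayOutputStream sink; // stands in for the result file
        private int length;

        Frame(int capacity, ByteArrayOutputStream sink) {
            this.buffer = new byte[capacity];
            this.sink = sink;
        }

        /** Returns false (and changes nothing) if the data does not fit. */
        boolean append(byte[] data, int dataLength) {
            if (length + dataLength > buffer.length) {
                return false;
            }
            System.arraycopy(data, 0, buffer, length, dataLength);
            length += dataLength;
            return true;
        }

        /** Writes the accumulated bytes to the sink and empties the frame. */
        void flush() {
            sink.write(buffer, 0, length);
            length = 0;
        }
    }

    /** Writes all tuples and returns how many serializations were performed. */
    static int writeAll(List<String> tuples, Frame frame) {
        ByteArrayOutputStream tupleBytes = new ByteArrayOutputStream();
        int serializations = 0;
        for (String tuple : tuples) {
            // Serialize the tuple exactly once (a string stands in for ADM-to-JSON).
            byte[] json = tuple.getBytes(StandardCharsets.UTF_8);
            tupleBytes.write(json, 0, json.length);
            serializations++;

            byte[] data = tupleBytes.toByteArray();
            if (!frame.append(data, data.length)) {
                // Frame is full: flush it, but keep the serialized bytes.
                frame.flush();
                if (!frame.append(data, data.length)) {
                    // Assumption of this sketch: a tuple fits into an empty frame.
                    throw new IllegalStateException("tuple larger than an empty frame");
                }
            }
            // Reset only after the bytes were added successfully.
            tupleBytes.reset();
        }
        frame.flush();
        return serializations;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int count = writeAll(List.of("aaaaa", "bbbbb", "cc"), new Frame(8, sink));
        System.out.println(count + " serializations, " + sink.size() + " bytes written");
    }
}
```

In this sketch the second tuple does not fit into the partially filled frame, so the frame is flushed and the already serialized bytes are appended again without a second serialization; the number of serializations therefore equals the number of tuples.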

Issue	Resolution
Large query result documents could be unnecessarily converted to JSON twice.	Each query result document is now converted to JSON only once.

Labels

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Linked issues

clones

MB-56701

[CX] Result tuples unnecessarily serialized twice

Activity

Ali Alsuliman July 5, 2023 at 7:35 PM

Covered by dev test.

Ali Alsuliman June 24, 2023 at 2:13 AM

7.2.1 merge commits:
https://github.com/couchbase/asterixdb/commit/7fa0131efd2d34ff26d8c1c3a797e3467bea4ae5
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17615

Ali Alsuliman June 24, 2023 at 1:15 AM

7.1.5 commits:
https://github.com/couchbase/asterixdb/commit/06e4a33215c7df3879ff74a6469b75e946aabfcb
https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17567

Fixed

Details
Assignee
Umang Agrawal(Deactivated)
Reporter
Ali Alsuliman
Is this a Regression?
Unknown
Triage
Untriaged
Story Points
0
Sprint
None
Priority
Major
Created June 23, 2023 at 5:24 PM

Updated August 31, 2024 at 11:05 AM

Resolved June 24, 2023 at 2:14 AM
