Description
While working on NCBC-3669, I discovered that, because of the asynchronous nature of StellarQueryResult/QueryClient and the way gRPC streaming works, retries never happen. The stream is not enumerated until the application layer starts iterating. This is good for performance: the result is never placed on the large object heap (LOH), and each row can instead be collected in gen0 once its iteration completes.
Unfortunately, gRPC does not surface a potential error (for example, index not found) until the stream is iterated, which is well after the retry handler has returned the result; the error is thrown as an RpcException in the application layer. For the error to be thrown and handled in the retry layer, we need to partially read the result before returning from the retry handler. That way the RpcException is thrown and handled in the retry handler instead of in the application.
From what I can tell, we will likely have to read the metadata and the first row inside the retry handler, as part of sending the request. The first row is cached and returned first to the application layer when enumeration begins. If there is a gRPC error, it is caught and handled appropriately by the retry handler. The query then either succeeds on a retry or times out.
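The approach described above can be sketched in a language-neutral way. The following is a minimal Python illustration of the pattern, not the SDK's code: `TransientRpcError`, `prime`, `with_retries`, and `make_flaky_query` are all hypothetical names, the async generator stands in for a lazily enumerated gRPC response stream, and only the first-row read is modeled (metadata is left out).

```python
import asyncio
from typing import AsyncIterator, Callable, Optional, TypeVar

T = TypeVar("T")


class TransientRpcError(Exception):
    """Hypothetical stand-in for a retryable gRPC failure (e.g. index not found)."""


async def prime(stream: AsyncIterator[T]) -> AsyncIterator[T]:
    """Eagerly read the first row, then return a stream that replays it.

    An error raised while fetching the first row propagates to the caller
    of prime(), which is the retry handler rather than application code.
    """
    iterator = stream.__aiter__()
    try:
        first = await iterator.__anext__()
    except StopAsyncIteration:
        async def empty() -> AsyncIterator[T]:
            return
            yield  # unreachable; marks this function as an async generator
        return empty()

    async def replay() -> AsyncIterator[T]:
        yield first                    # cached row is returned first
        async for row in iterator:     # remaining rows are still streamed lazily
            yield row

    return replay()


async def with_retries(
    send: Callable[[], AsyncIterator[T]],
    attempts: int = 3,
    delay: float = 0.0,
) -> AsyncIterator[T]:
    """Send the request and prime the stream, retrying on transient errors."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return await prime(send())
        except TransientRpcError as err:
            last_error = err
            await asyncio.sleep(delay)
    raise TimeoutError("query did not succeed within the retry budget") from last_error


def make_flaky_query(failures: int) -> Callable[[], AsyncIterator[str]]:
    """Build a fake query whose first `failures` calls fail on first iteration."""
    state = {"calls": 0}

    async def query() -> AsyncIterator[str]:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientRpcError("index not found")
        for row in ("row-1", "row-2", "row-3"):
            yield row

    return query


async def main() -> None:
    rows = await with_retries(make_flaky_query(failures=2))
    async for row in rows:
        print(row)


if __name__ == "__main__":
    asyncio.run(main())
```

Only one row is buffered, so the memory benefit of streaming is mostly preserved. The trade-off in this sketch is that an error occurring after the first row still surfaces in the application during enumeration.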
Issue Links
- duplicates: NCBC-3623 Fail fast when server returns Query error instead of while results are enumerated (Closed)
- is duplicated by: NCBC-3663 In PS RetryHandler DecodingFailureException thrown although status exists (Closed)
- relates to: NCBC-3669 Add StellarQueryClient so the failed queries can be retried (Closed)