Details
Task
Resolution: Unresolved
Major
5.5.0
CX Sprint 114, CX Sprint 115, CX Sprint 116, CX Sprint 117, CX Sprint 118, CX Sprint 119, CX Sprint 120, CX Sprint 121, CX Sprint 122, CX Sprint 123, CX Sprint 124, CX Sprint 125, CX Sprint 126, CX Sprint 127, CX Sprint 128, CX Sprint 129, CX Sprint 130, CX Sprint 131, CX Sprint 132, CX Sprint 133, CX Sprint 134
Description
Currently, a bucket connection has two parts: ingestion and storage. When a problem occurs on the ingestion side, its impact is minimal and it is resolved by retrying. We have many tests that cover different failure scenarios on that side.
However, a failure in storage triggers the transaction abort sequence, and this is where the problem lies.
We use write-ahead logs for our changes, and the transaction abort sequence relies on these logs to reach a consistent state. We don't have those logs for bucket connections, so a different mechanism must be in place to abort such transactions. An easy approach would be to halt when that happens and let recovery take care of it.
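To make the distinction concrete, here is a minimal sketch of the two abort paths described above. It is an illustration only, not the actual implementation: `Transaction`, `HaltRequested`, and the in-memory undo list are hypothetical names chosen for this example, and the "halt" is modeled as an exception rather than a real process stop.

```python
_MISSING = object()  # marks "key did not exist before this write"


class HaltRequested(Exception):
    """Hypothetical signal: stop and let recovery restore a consistent state."""


class Transaction:
    """Toy transaction; has_wal=False stands in for a bucket connection."""

    def __init__(self, has_wal):
        self.has_wal = has_wal
        self.undo_log = []  # (key, previous value) records, kept only with a log
        self.state = {}

    def write(self, key, value):
        if self.has_wal:
            # Record the previous value so abort() can undo this write.
            self.undo_log.append((key, self.state.get(key, _MISSING)))
        self.state[key] = value

    def abort(self):
        if not self.has_wal:
            # Nothing to roll back from: the simple option is to halt
            # and rely on recovery.
            raise HaltRequested("no log available; halt and recover")
        # Undo writes in reverse order to return to the starting state.
        while self.undo_log:
            key, previous = self.undo_log.pop()
            if previous is _MISSING:
                self.state.pop(key, None)
            else:
                self.state[key] = previous
```

With a log, `abort()` walks the records backwards; without one, the only safe action in this sketch is to signal a halt.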
At this point, there is no way to force storage-side failures, since these operations are memory-only and corrupted data would be caught by the parser. However, it is important that such failure scenarios are tested to avoid a crisis.
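One possible way to exercise such scenarios is a test-only hook that makes a chosen storage write fail. The sketch below assumes a simple key-value storage interface; `FaultInjectingStore`, `InjectedStorageFailure`, and the `fail_on_write` parameter are hypothetical and do not correspond to any existing class in the codebase.

```python
class InjectedStorageFailure(Exception):
    """Hypothetical error raised by the test hook to simulate a storage failure."""


class FaultInjectingStore:
    """In-memory store that can be told to fail on the Nth write (1-based)."""

    def __init__(self, fail_on_write=None):
        self.data = {}
        self.writes = 0
        self.fail_on_write = fail_on_write

    def put(self, key, value):
        self.writes += 1
        if self.fail_on_write is not None and self.writes == self.fail_on_write:
            # Fail before touching the data, as a storage-side error would.
            raise InjectedStorageFailure(f"injected failure on write {self.writes}")
        self.data[key] = value
```

A test could construct the store with `fail_on_write=2`, feed it records, and then check that the abort or halt path runs and that recovery leaves the data in a consistent state.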