Details
- Bug
- Resolution: Won't Fix
- Major
- None
- 6.5.1, 6.6.0, 6.6.1, 6.5.0, Cheshire-Cat
- None
- Untriaged
- 1
- No
Description
What's the issue?
We've seen a couple of cases where restores "hang" indefinitely when the target buckets have a very low residency ratio. I suspect this is because 'TEMP_OOM' errors are retried indefinitely, i.e. we never exhaust retries and return an error.
Is there a workaround?
Yes, restore to an adequately provisioned cluster.
What's the fix?
We should account for 'TEMP_OOM' failures during a restore so that retries can eventually be exhausted. They should be handled somewhat specially: we should retry more times for 'TEMP_OOM' failures than for other errors, because in most scenarios data will be evicted, allowing further progress.
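As a rough illustration of the proposed handling, here is a minimal Python sketch, not the actual cbbackupmgr implementation. All names (`TempOOMError`, `RetriesExhaustedError`, `write_with_retries`) and all limits and delays are hypothetical, chosen only to show the idea: 'TEMP_OOM' failures get a larger but still finite retry budget, so the restore eventually returns an error instead of hanging.

```python
import time


class TempOOMError(Exception):
    """Hypothetical stand-in for a 'TEMP_OOM' response from the cluster."""


class RetriesExhaustedError(Exception):
    """Raised once a retry budget is used up, instead of retrying forever."""


def write_with_retries(write, max_retries=3, max_temp_oom_retries=50,
                       base_delay=0.05, max_delay=5.0, sleep=time.sleep):
    """Call write() until it succeeds or a retry budget is exhausted.

    'TEMP_OOM' failures use a separate, larger budget than other errors,
    because eviction will usually free memory and allow progress.
    All default values here are illustrative assumptions.
    """
    temp_oom_failures = 0
    other_failures = 0
    while True:
        try:
            return write()
        except TempOOMError as err:
            temp_oom_failures += 1
            if temp_oom_failures > max_temp_oom_retries:
                raise RetriesExhaustedError(
                    "gave up after repeated TEMP_OOM failures") from err
            # Capped exponential backoff gives the bucket time to evict data.
            sleep(min(max_delay, base_delay * 2 ** (temp_oom_failures - 1)))
        except Exception as err:
            other_failures += 1
            if other_failures > max_retries:
                raise RetriesExhaustedError(
                    "gave up after repeated failures") from err
            sleep(min(max_delay, base_delay * 2 ** (other_failures - 1)))
```

With a bounded budget like this, a restore into a bucket that never frees memory fails with a clear error rather than appearing to hang, while short-lived memory pressure is still absorbed by the extra retries.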
Issue Links
- relates to MB-38686 [CBM] Add support for resuming restores (Closed)