I've had a look into this and believe I know what's happening; I'm able to reproduce the issue under artificial circumstances (by modifying the code). If you are in fact hitting what I think you're hitting, this has already been fixed in master and should only be possible under pretty rare circumstances. Below is what I believe is happening:
1) Start restoring a small amount of data
1a) The worker pool is started (in your case with a single worker; this creates a buffered error channel with room for a single error)
2) The archive begins reading data from disk
3) We fire off all the mutations for the restore (in this case two)
4) We check the (archive) error stream; it's empty (because we read everything from disk successfully)
5) We begin the teardown process, informing the worker pool it should complete the rest of its work and exit cleanly
6) The worker pool begins teardown
6a) It handles the first mutation, we hit an error; it gets put in the errors stream
6b) It handles the second mutation, we hit another error; we block indefinitely attempting to put the error in the error channel
To summarize, I believe this is a race condition that should only occur when the archive source has finished sending all of its mutations (without error) and begins waiting for the worker pool to finish, at which point the final mutations hit enough errors to fill up the error channel before the pool can exit. This happens because once we begin waiting, there are no longer any threads reading from the error channel.
This has already been fixed in master because I did some work on the error propagation in CC where the buffered channel was changed to fit enough errors for every in-flight mutation to fail (one of these errors would then be propagated to the user once we'd finished teardown).