Details
- Bug
- Resolution: Duplicate
- Critical
- 3.0.3
- Security Level: Public
- None
- 5
Description
In CBG-1878 we added logic to tell BLIP clients to disconnect and reconnect if the BLIP handlers panic and we see that the database has disappeared (because the panic is likely caused by the handlers trying to access a nil database).
It appears that, even after this case is hit, go-blip can still process further requests, which will also panic. In most cases this is benign: the logs get some noise but nothing meaningful happens. In some cases, however, the panic can occur outside the BlipSyncContext, e.g. in MultiChangesFeed, where it would bubble up to base.FatalPanicHandler and bring down SGW.
It sounds like we need to do two things here:
- When we send clients an ErrDatabaseWentAway, we need to promptly close the connection.
- We should also tell go-blip to stop processing further requests (specifically, to stop calling handlers) once we start the teardown.
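To make the two points above concrete, here is a minimal sketch of the intended behaviour. It does not use go-blip's real API: the `conn` type, `dispatch`, `teardown`, and `errDatabaseWentAway` are all hypothetical names invented for illustration, and the sketch assumes that any handler panic means the database has gone away (the real check would also confirm that the database has disappeared).

```go
package main

import (
	"errors"
	"sync"
	"sync/atomic"
)

// errDatabaseWentAway is a hypothetical stand-in for the error sent to clients.
var errDatabaseWentAway = errors.New("database went away")

// conn is a hypothetical connection wrapper, not a go-blip type.
type conn struct {
	closing   atomic.Bool   // set as soon as teardown starts
	closeOnce sync.Once     // guarantees the connection is closed only once
	closed    chan struct{} // closed when the connection has been shut down
}

func newConn() *conn {
	return &conn{closed: make(chan struct{})}
}

// teardown first flags the connection as closing, so that no further
// handlers are invoked, and then promptly closes the connection.
func (c *conn) teardown() {
	c.closing.Store(true)
	c.closeOnce.Do(func() { close(c.closed) })
}

// dispatch invokes handler unless teardown has already started.
// It reports whether the handler was actually invoked.
func (c *conn) dispatch(handler func() error) (ran bool) {
	if c.closing.Load() {
		// Teardown in progress: drop the request instead of calling the handler.
		return false
	}
	defer func() {
		// Assumption: a handler panic is treated as "database went away".
		if r := recover(); r != nil {
			c.teardown()
		}
	}()
	ran = true
	if err := handler(); errors.Is(err, errDatabaseWentAway) {
		c.teardown()
	}
	return ran
}
```

The flag is set before the connection is closed so that a request arriving during the close is rejected rather than handed to a handler. Requests already running in other goroutines are not interrupted by this sketch; it only prevents new handler calls.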