Details
Bug
Resolution: Fixed
Major
1.7.1
Security Level: Public
None
Description
The basic problem is that our expiry pager can cause a massive flood of deletes to the sqlite DB (7 million in less than 1 second). While that is going on, the write rate is basically 0, so the disk queue blows up.
A few takeaways:
1) We need a stat on DB deletes (we already have reads/creates/updates) so that we can see when this is happening. It wasn't immediately obvious.
2) We need an easy way to control the expiry pager. Making it run more often in this case would have helped immensely.
3) Is there anything "creative" we can do with sqlite today to make these deletes faster? Perhaps a quick fix for 1.7.2?
4) What will the behavior of 2.0 be like? I suspect that with the append-only log format the deletes should at least be faster, but we would still have the problem of fewer writes happening while the deletes are in progress.
5) What's the longer-term fix for this situation? Can we better interleave deletes with writes? We don't want to give full priority to writes, since we'd never end up deleting anything. What about a separate dispatcher "thread" just for deletes?
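To make takeaway 5 a bit more concrete, here is a minimal sketch of one way deletes could be interleaved with writes: drain both queues in alternating, bounded batches, with one sqlite transaction per batch. This is only an illustration in Python's standard `sqlite3` module, not the engine's actual flusher. The `kv` table, the `flush_interleaved` function, and the `batch_size` parameter are all hypothetical names chosen for the example.

```python
import sqlite3
from collections import deque


def flush_interleaved(conn, writes, deletes, batch_size=1000):
    """Drain both queues in alternating bounded batches.

    Hypothetical sketch: assumes a table kv(k PRIMARY KEY, v) already exists.
    `writes` is a deque of (key, value) tuples, `deletes` is a deque of keys.
    Returns a list of (kind, count) tuples, one per committed batch, which
    also serves as a crude per-operation stat.
    """
    rounds = []
    while writes or deletes:
        if writes:
            n = min(batch_size, len(writes))
            batch = [writes.popleft() for _ in range(n)]
            with conn:  # commits on success, rolls back on exception
                conn.executemany(
                    "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", batch
                )
            rounds.append(("write", n))
        if deletes:
            n = min(batch_size, len(deletes))
            batch = [(deletes.popleft(),) for _ in range(n)]
            with conn:  # one transaction per batch of deletes
                conn.executemany("DELETE FROM kv WHERE k = ?", batch)
            rounds.append(("delete", n))
    return rounds
```

Because each pass handles at most `batch_size` items of each kind, a large backlog of expired keys can delay a pending write by at most one delete batch, and deletes still make progress while writes keep arriving. Grouping each batch into a single transaction is also one candidate answer to takeaway 3, since it avoids a commit per row.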