Details
Improvement
Resolution: Unresolved
Major
Description
Capturing a conversation with Jon regarding this requirement:
Jon wrote:
Hi Srini, MB-51585, which limits the ownership history kept in the Eventing scratchpad, was a very good fix. Looking at it got me thinking again about something that I think could be very helpful.

Right now for Eventing we have "Deploy From Now" and "Deploy From Everything". It might be super beneficial to have something like:
* "Deploy from at least 1 hour ago"
* "Deploy from at least 2 hours ago"
* "Deploy from at least 1 day ago"
* "Deploy from at least 2 days ago"

I imagine we could do this with a rolling set of 3 checkpoints for hours and 3 checkpoints for days.

The benefit here is that we can start DCP up and skip days or years of data that would be ignored by a current "Deploy From Everything".
Srini wrote:
Hi Jon, I want to make sure I understand your suggestion well. Since you're talking about a "Deploy from at least <some period> ago", the use case is about a function that's currently not deployed. There are two variations: the function was never deployed, or it was deployed, was active for a bit (processing mutations, creating checkpoints, etc.), and then got undeployed. Regardless of which variation we are talking about, there will not be any checkpoints corresponding to a function that's currently not deployed. If it ever was deployed, then as part of undeployment we nuke all the metadata that we persisted along the way, including the checkpoints.

If the function was merely paused, we would have all the recent checkpoints. We could support a "Resume from x hours ago" variation in addition to the currently supported "Resume from where we left off". Is this what you mean?

By the way, I had a chat with Ankit to understand this a bit more, and here's what he shared:

In the ownership history checkpoints, we used to store the entire history of vb ownership. These never had any seq-number information. For (at least) one customer, the document grew to more than 20 MB, triggering an issue. As part of MB-51585, the initial direction was to limit the ownership history while checkpointing. However, as work on that progressed, the decision was taken to not persist the ownership history at all, as it was never used and did not add any value.
Coming to the topic of supporting a "Deploy from at least x hours ago" use case, we could start a goroutine that will periodically (once an hour) capture the following information:

vb-id, vb-uuid, latest seqno on vb

This essentially will give us the seq-no at a per-vb level for various points in time. We already listen to bucket lifecycle information, so we will be in a position to capture this for newly created buckets, stop monitoring deleted buckets, etc. We could capture this information at a finer granularity for the immediate past (hourly, for example), but retain it only at a coarser level for the more distant past. For example:
* For the past week, retain hourly.
* For the past month (excepting the past week), retain daily.
* For the past year (excepting the past month), retain weekly or monthly, etc.
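The periodic capture proposed above (one vb-id, vb-uuid, latest-seqno tuple per vBucket, taken by a goroutine) might look roughly like the sketch below. Everything here is illustrative: `Recorder`, `Source`, `Capture` and `Run` are hypothetical names, and `Source` stands in for whatever call would actually report the current seqnos, so this is not existing Eventing code. The demo ticks every 10 ms only so that it finishes quickly; the proposal is for an hourly interval.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// VbSeq is the per-vBucket tuple named in the proposal.
type VbSeq struct {
	VbID   uint16
	VbUUID uint64
	Seqno  uint64
}

// Snapshot is one periodic capture for a bucket.
type Snapshot struct {
	TakenAt time.Time
	Vbs     []VbSeq
}

// Recorder accumulates snapshots. Source is a hypothetical stand-in for
// whatever call reports the current seqnos; it is not an Eventing API.
type Recorder struct {
	Source func() ([]VbSeq, error)

	mu    sync.Mutex
	snaps []Snapshot
}

// Capture records one snapshot stamped with the given time.
func (r *Recorder) Capture(now time.Time) error {
	vbs, err := r.Source()
	if err != nil {
		return err
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.snaps = append(r.snaps, Snapshot{TakenAt: now, Vbs: vbs})
	return nil
}

// Snapshots returns a copy of what has been captured so far, in capture order.
func (r *Recorder) Snapshots() []Snapshot {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]Snapshot(nil), r.snaps...)
}

// Run captures on every tick until stop is closed; start it as a goroutine.
func (r *Recorder) Run(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case now := <-ticker.C:
			if err := r.Capture(now); err != nil {
				fmt.Println("capture failed:", err)
			}
		case <-stop:
			return
		}
	}
}

func main() {
	var seqno uint64
	r := &Recorder{Source: func() ([]VbSeq, error) {
		seqno += 10 // pretend the vBucket advanced between captures
		return []VbSeq{{VbID: 0, VbUUID: 42, Seqno: seqno}}, nil
	}}
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() { r.Run(10*time.Millisecond, stop); close(done) }()
	time.Sleep(55 * time.Millisecond)
	close(stop)
	<-done
	for _, s := range r.Snapshots() {
		fmt.Println(s.TakenAt.Format(time.RFC3339Nano), s.Vbs)
	}
}
```

One recorder per bucket would fit the bucket lifecycle handling mentioned above: start `Run` when a bucket appears and close `stop` when it is deleted.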
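The tiered retention listed above could be implemented as a pruning pass over the stored snapshots. The sketch below is one possible reading, with assumptions of its own: "week", "month" and "year" are taken as 7, 30 and 365 days, the coarsest tier is weekly, and the oldest snapshot in each slot is the one kept, so that a snapshot retained on one pass is not displaced by a later pass. `Prune`, `granularity` and `at` are hypothetical names, and the per-vBucket payload is left out for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Snapshot is one periodic capture; the per-vBucket seqnos are omitted here.
type Snapshot struct {
	TakenAt time.Time
}

// granularity maps a snapshot's age to its retention slot width.
// Zero means it is past the one-year horizon and gets dropped.
func granularity(age time.Duration) time.Duration {
	const day = 24 * time.Hour
	switch {
	case age <= 7*day:
		return time.Hour
	case age <= 30*day:
		return day
	case age <= 365*day:
		return 7 * day
	}
	return 0
}

// Prune returns the snapshots to retain, oldest first, keeping the oldest
// snapshot in each slot. The input slice is not modified.
func Prune(snaps []Snapshot, now time.Time) []Snapshot {
	sorted := append([]Snapshot(nil), snaps...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].TakenAt.Before(sorted[j].TakenAt)
	})
	type slot struct{ width, index int64 }
	seen := make(map[slot]bool)
	var kept []Snapshot
	for _, s := range sorted {
		w := granularity(now.Sub(s.TakenAt))
		if w == 0 {
			continue
		}
		k := slot{int64(w), s.TakenAt.Unix() / int64(w/time.Second)}
		if !seen[k] {
			seen[k] = true
			kept = append(kept, s)
		}
	}
	return kept
}

// at builds a UTC timestamp for the demo data.
func at(y, m, d, h int) time.Time {
	return time.Date(y, time.Month(m), d, h, 0, 0, 0, time.UTC)
}

func main() {
	now := at(2024, 1, 31, 0)
	snaps := []Snapshot{
		{TakenAt: at(2022, 12, 27, 0)}, // older than a year: dropped
		{TakenAt: at(2024, 1, 21, 1)},  // between a week and a month old: daily tier
		{TakenAt: at(2024, 1, 21, 5)},  // same day as the previous one: dropped
		{TakenAt: at(2024, 1, 30, 22)}, // past week: hourly tier
		{TakenAt: at(2024, 1, 30, 23)},
	}
	for _, s := range Prune(snaps, now) {
		fmt.Println(s.TakenAt.Format(time.RFC3339))
	}
}
```

Running such a pass after each hourly capture would keep the stored history bounded, which matters given the document-size issue that motivated MB-51585.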
With this information, we will be able to support the new deploy variations you enumerated above.
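To make the last step concrete, here is a hedged sketch of how a "Deploy from at least X ago" request could be resolved against the captured snapshots: pick the newest snapshot that is at least X old and start each vBucket from the seqno recorded in it. `StartPoint`, `deployOptions` and `at` are hypothetical names, and what to do when no snapshot is old enough (for instance falling back to "Deploy From Everything") is an open choice that the ticket does not settle.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// VbSeq is the per-vBucket tuple named in the proposal.
type VbSeq struct {
	VbID   uint16
	VbUUID uint64
	Seqno  uint64
}

// Snapshot is one periodic capture for a bucket.
type Snapshot struct {
	TakenAt time.Time
	Vbs     []VbSeq
}

// deployOptions are the minimum ages from the variations listed in the ticket.
var deployOptions = []time.Duration{time.Hour, 2 * time.Hour, 24 * time.Hour, 48 * time.Hour}

// StartPoint returns the newest snapshot taken at or before now-minAge.
// snaps must be sorted oldest first; ok is false if none is old enough.
func StartPoint(snaps []Snapshot, now time.Time, minAge time.Duration) (s Snapshot, ok bool) {
	cutoff := now.Add(-minAge)
	// Index of the first snapshot that is too recent.
	i := sort.Search(len(snaps), func(i int) bool {
		return snaps[i].TakenAt.After(cutoff)
	})
	if i == 0 {
		return Snapshot{}, false
	}
	return snaps[i-1], true
}

// at builds a UTC timestamp for the demo data.
func at(y, m, d, h int) time.Time {
	return time.Date(y, time.Month(m), d, h, 0, 0, 0, time.UTC)
}

func main() {
	now := at(2024, 1, 31, 0)
	snaps := []Snapshot{
		{TakenAt: at(2024, 1, 29, 12), Vbs: []VbSeq{{VbID: 0, VbUUID: 42, Seqno: 100}}},
		{TakenAt: at(2024, 1, 30, 22), Vbs: []VbSeq{{VbID: 0, VbUUID: 42, Seqno: 250}}},
		{TakenAt: at(2024, 1, 30, 23), Vbs: []VbSeq{{VbID: 0, VbUUID: 42, Seqno: 300}}},
	}
	for _, minAge := range deployOptions {
		if s, ok := StartPoint(snaps, now, minAge); ok {
			fmt.Printf("at least %v ago: vb %d starts at seqno %d\n",
				minAge, s.Vbs[0].VbID, s.Vbs[0].Seqno)
		} else {
			fmt.Printf("at least %v ago: no snapshot that old\n", minAge)
		}
	}
}
```

The recorded vb-uuid would presumably accompany the seqno when the stream is opened, so that a change in a vBucket's history since the snapshot can be detected rather than silently resuming from a stale position.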