Description
This can be seen in the UI, e.g. after hitting Pause, the Resume button is available in about 2 seconds.
This is not actually a UI issue. Rather, the composite_status reported via _p/event/api/v1/status goes to "pausing" but is then set to "paused" while the underlying server is still cleaning up and doing housekeeping.
{
  "apps": [
    {
      "composite_status": "paused",
      "name": "ts111",
      "num_bootstrapping_nodes": 0,
      "num_deployed_nodes": 1,
      "deployment_status": true,
      "processing_status": false
    }
  ],
  "num_eventing_nodes": 1
}
Of concern is that if a UI, CLI, or REST resume is issued too quickly (during the roughly 10-second housekeeping period), the handler will have issues and reaching the final deployed state will take an excessive amount of time.
For example, with http://review.couchbase.org/#/c/121533/ we can deploy and undeploy a handler in 16 seconds and 17 seconds, respectively.
Pausing seems to take 2 seconds, but if we hit the Resume button immediately (or use the CLI or REST too quickly), the deploy will take about 80 seconds. This is 5x longer than it should be.
If we wait about 10 more seconds after the Resume button becomes enabled, the Resume action takes just 16 seconds, as expected.
I put some timing information in the attached "pause_resume_race.txt".
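Until the status reporting is fixed, a client could approximate the manual workaround described above: poll the status payload until the handler reports "paused", then wait out a grace period before issuing Resume. The following Python is only a sketch under stated assumptions. The helper names (`composite_status`, `wait_until_safe_to_resume`), the `fetch_status` callable, and the 10-second default grace period are hypothetical; the grace period is taken from the timing observed in this ticket, not from any documented guarantee.

```python
import time
from typing import Callable, Optional


def composite_status(status_doc: dict, app_name: str) -> Optional[str]:
    """Return the composite_status of app_name from a status payload, or None."""
    for app in status_doc.get("apps") or []:
        if app.get("name") == app_name:
            return app.get("composite_status")
    return None


def wait_until_safe_to_resume(
    fetch_status: Callable[[], dict],   # hypothetical: returns the parsed status JSON
    app_name: str,
    grace_seconds: float = 10.0,        # assumption: housekeeping time seen in this ticket
    poll_seconds: float = 1.0,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until app_name reports "paused", then wait grace_seconds.

    Returns True once the grace period has elapsed, or False if the
    handler never reported "paused" before timeout_seconds ran out.
    """
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if composite_status(fetch_status(), app_name) == "paused":
            # "paused" is reported while housekeeping is still running,
            # so hold off before allowing a Resume.
            sleep(grace_seconds)
            return True
        sleep(poll_seconds)
    return False
```

The caller supplies `fetch_status`, for example a function that issues a GET against the status endpoint and parses the JSON body, and only sends the Resume request once `wait_until_safe_to_resume` returns True. A fixed delay only masks the race; the underlying fix is for "paused" to be reported after housekeeping completes.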
Gerrit Reviews
For Gerrit Dashboard: MB-37823

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
122370,1 | After a Pause a Resume operation is allowed too early MB-37823 | unstable | eventing | NEW | -1 | 0 |