Details
Type: Bug
Resolution: Fixed
Priority: Critical
Affects Version/s: 3.1.6, 4.0.0, 4.1.2, 4.5.0, 4.5.1
Fix Version/s: None
Triage: Untriaged
Is this a Regression?: Unknown
Description
1. Apply the attached patch to ep-engine (since scheduling may be arbitrary, introducing arbitrary delays in the code should not affect correctness).
2. Create a bucket, upload some data, and restart the server a couple of times to build up some failover history.
$ ~/dev/membase/repo-watson/install/bin/cbstats 127.0.0.1:12000 failovers 7
 vb_7:0:id: 215814349521305
 vb_7:0:seq: 160
 vb_7:1:id: 155329856619529
 vb_7:1:seq: 160
 vb_7:2:id: 91747243693536
 vb_7:2:seq: 160
 vb_7:3:id: 128962012805783
 vb_7:3:seq: 160
 vb_7:4:id: 275970465686046
 vb_7:4:seq: 160
 vb_7:5:id: 241099843010628
 vb_7:5:seq: 160
 vb_7:6:id: 50511930730683
 vb_7:6:seq: 160
 vb_7:7:id: 280675982653774
 vb_7:7:seq: 0
 vb_7:num_entries: 8
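For reference, stats lines like the ones above can be turned into a structured failover table with a short script. This is a hypothetical helper written for illustration (cbstats does not ship such a parser); it assumes the ` vb_<vb>:<idx>:id/seq: <value>` line format shown above, where entry 0 is the most recent failover entry.

```python
# Parse `cbstats ... failovers <vb>` output lines into a failover table.
# Hypothetical helper for illustration only; not part of cbstats.
import re

def parse_failovers(lines):
    """Return a list of {'id': ..., 'seq': ...} entries ordered by index."""
    entries = {}
    for line in lines:
        m = re.match(r"vb_\d+:(\d+):(id|seq):\s*(\d+)", line.strip())
        if m:
            idx, field, value = int(m.group(1)), m.group(2), int(m.group(3))
            entries.setdefault(idx, {})[field] = value
    return [entries[i] for i in sorted(entries)]

sample = [
    " vb_7:0:id: 215814349521305",
    " vb_7:0:seq: 160",
    " vb_7:7:id: 280675982653774",
    " vb_7:7:seq: 0",
    " vb_7:num_entries: 8",   # summary line, skipped by the regex
]
table = parse_failovers(sample)
print(table)
```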
3. Run the following via /diag/eval.
V = 7, ns_memcached:set_vbucket("default", V, dead), ns_memcached:sync_delete_vbucket("default", V).
This is what happens during bucket flush for all vbuckets.
4. Let ns_server recreate the vbucket and wait for the 10-second sleep to expire.
You can watch the local docs to see when that happens.
$ ../install/bin/couch_dbdump --local --json data/n_0/data/default/7.couch.1
Dumping "data/n_0/data/default/7.couch.1":
{"id":"_local/vbstate",
 "value":"{\"state\": \"active\",
           \"checkpoint_id\": \"0\",
           \"max_deleted_seqno\": \"0\",
           \"failover_table\": [{\"id\":199363149644834,\"seq\":0}],
           \"snap_start\": \"0\",\"snap_end\": \"0\",
           \"max_cas\": \"0\",
           \"drift_counter\": \"-140737488355328\"}"}

Total docs: 1
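Note that the `_local/vbstate` document stores the vbucket state as a JSON string nested inside a JSON document, so it has to be decoded twice when inspecting it programmatically. A minimal sketch, assuming the dump output shown above:

```python
import json

# couch_dbdump prints the local doc as JSON whose "value" field is itself
# a JSON-encoded string holding the vbucket state (abbreviated here).
doc = {
    "id": "_local/vbstate",
    "value": "{\"state\": \"active\", \"checkpoint_id\": \"0\", "
             "\"failover_table\": [{\"id\":199363149644834,\"seq\":0}]}",
}

# Second decode: the embedded vbucket state.
state = json.loads(doc["value"])
print(state["state"])                      # "active"
print(state["failover_table"][0]["seq"])   # 0
```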
And then the vbucket goes back to the dead state.
$ ../install/bin/couch_dbdump --local --json data/n_0/data/default/7.couch.1
Dumping "data/n_0/data/default/7.couch.1":
{"id":"_local/vbstate",
 "value":"{\"state\": \"dead\",
           \"checkpoint_id\": \"0\",
           \"max_deleted_seqno\": \"0\",
           \"failover_table\": [{\"id\":215814349521305,\"seq\":160},
                                {\"id\":155329856619529,\"seq\":160},
                                {\"id\":91747243693536,\"seq\":160},
                                {\"id\":128962012805783,\"seq\":160},
                                {\"id\":275970465686046,\"seq\":160},
                                {\"id\":241099843010628,\"seq\":160},
                                {\"id\":50511930730683,\"seq\":160},
                                {\"id\":280675982653774,\"seq\":0}],
           \"snap_start\": \"160\",\"snap_end\": \"160\",
           \"max_cas\": \"1473396979655770112\",
           \"drift_counter\": \"-140737488355328\"}"}

Total docs: 1
5. At this point the stats show fresh failover history. Restart the server and observe that the failover history from the deleted vbucket is resurrected.
$ ~/dev/membase/repo-watson/install/bin/cbstats 127.0.0.1:12000 failovers 7
 vb_7:0:id: 215814349521305
 vb_7:0:seq: 160
 vb_7:1:id: 155329856619529
 vb_7:1:seq: 160
 vb_7:2:id: 91747243693536
 vb_7:2:seq: 160
 vb_7:3:id: 128962012805783
 vb_7:3:seq: 160
 vb_7:4:id: 275970465686046
 vb_7:4:seq: 160
 vb_7:5:id: 241099843010628
 vb_7:5:seq: 160
 vb_7:6:id: 50511930730683
 vb_7:6:seq: 160
 vb_7:7:id: 280675982653774
 vb_7:7:seq: 0
 vb_7:num_entries: 8
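The resurrection is consistent with a lost-update race: a flush that captured the old vbucket's state before deletion gets delayed (which is what the attached patch provokes), the vbucket is deleted and recreated with a fresh failover table, and then the delayed flush writes the stale state over the fresh one. A deliberately simplified model of that interleaving (the names here are illustrative, not actual ep-engine identifiers):

```python
# Simplified model of the delete/recreate vs. delayed-flush race.
# "disk" stands for the _local/vbstate document; this is a sketch of one
# plausible interleaving, not the actual ep-engine code.
disk = {}

def flush(snapshot):
    """A flusher writes whatever in-memory snapshot it captured earlier."""
    disk["vbstate"] = snapshot

old_state = {"state": "dead",
             "failover_table": [{"id": 215814349521305, "seq": 160}]}
new_state = {"state": "active",
             "failover_table": [{"id": 199363149644834, "seq": 0}]}

# 1. The flusher captures the old vbucket's state, then gets delayed
#    (the attached patch injects such a delay).
pending = dict(old_state)

# 2. Meanwhile the vbucket is deleted and recreated; the fresh state
#    reaches disk first.
disk.clear()
flush(new_state)

# 3. The delayed flush finally runs and clobbers the fresh state,
#    so the old failover table survives the next warmup.
flush(pending)

print(disk["vbstate"]["failover_table"][0]["seq"])  # 160: the old table is back
```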