Details
-
Bug
-
Resolution: Fixed
-
Critical
-
5.5.0
-
Enterprise Edition 5.1.0 build 1477
4 node cluster : kv-eventing-index-n1ql
-
Untriaged
-
Centos 64-bit
-
-
No
Description
Steps to Repro:
===========
1) Deployed the following handler code
function OnUpdate(doc, meta) {
    var doc_id = meta.id;
    log('creating document for : ', doc);
    dst_bucket[doc_id] = {'doc_id' : doc_id}; // SET operation
}

function OnDelete(meta) {
    log('deleting document', meta.id);
    delete dst_bucket[meta.id]; // DELETE operation
}
2) Created 4032 docs in the source bucket.
3) While eventing was writing docs to the destination bucket, killed the eventing consumers using the following command:
killall -9 eventing-consumer
The destination bucket ended up with only 1,977 of the expected 4032 docs.
Abhishek Singh did an initial triage and concluded that whenever eventing-consumer is killed, the eventing producer should close the existing DCP streams, read the checkpoint state from the metadata bucket, and restart the DCP streams.
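The recovery sequence proposed in the triage (close the stream, read the checkpoint, restart from it) can be illustrated with a toy model. This is only a sketch under stated assumptions: `simulate`, `BATCH`, and the single-stream, in-memory bookkeeping are hypothetical and are not Eventing APIs, and the model does not try to reproduce the exact 1,977 count from this report. It only shows why mutations already handed to a killed consumer are lost unless the stream is restarted from the last checkpoint.

```javascript
// Toy model only: every name here is hypothetical, not a Couchbase Eventing API.
const BATCH = 100; // assumed number of mutations handed to the consumer per round

function simulate(totalDocs, killAfter, recoverFromCheckpoint) {
  const dst = new Map(); // stands in for the destination bucket
  let checkpoint = 0;    // last seq whose write completed (metadata bucket stand-in)
  let cursor = 0;        // last seq the stream has handed to the consumer
  let queue = [];        // mutations in flight inside the consumer
  let killed = false;

  while (cursor < totalDocs || queue.length > 0) {
    // Producer side: stream the next batch of mutations to the consumer.
    for (let i = 0; i < BATCH && cursor < totalDocs; i++) {
      cursor++;
      queue.push(cursor);
    }

    // Simulate a single `killall -9 eventing-consumer`: in-flight work is lost.
    if (!killed && cursor >= killAfter) {
      killed = true;
      queue = [];
      if (recoverFromCheckpoint) {
        // Close the stream, read the checkpoint, restart the stream from there.
        cursor = checkpoint;
      }
      continue;
    }

    // Consumer side: apply the SET for each mutation, then advance the checkpoint.
    for (const seq of queue) {
      dst.set('doc_' + seq, { doc_id: 'doc_' + seq });
      checkpoint = seq;
    }
    queue = [];
  }
  return dst.size;
}

console.log('without restart:', simulate(4032, 2000, false));
console.log('with restart from checkpoint:', simulate(4032, 2000, true));
```

In this model the run without a restart ends short of 4032 docs, while the run that rewinds to the checkpoint writes all of them. Replaying a few mutations after the rewind is harmless here because the handler's SET is idempotent.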
Logs attached. Also pasting some useful logs used by Abhishek Singh to debug the issue.
Balakumarans-MacBook-Pro:testrunner balakumaran.g$ curl http://Administrator:password@10.112.170.101:8092/metadata/_design/dev_d/_view/v?stale=false -s | jq ".rows[].value[0]" | awk '{sum+=$1} END {print sum}'
1391
[root@node2-cb500-centos7 ~]# ps aux | grep kvport
couchba+ 12409 13.8 5.3 191056 54212 ? Sl 08:49 1:47 /opt/couchbase/bin/eventing-producer -adminport=8096 -dir=/opt/couchbase/var/lib/couchbase/data/@eventing -kvport=11210 -restport=8091 -uuid=7448eef313e087076808908126eeae91 -adminsslport=18096 -certfile=/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem -keyfile=/opt/couchbase/var/lib/couchbase/config/memcached-key.pem
root 12542 0.0 0.0 112640 960 pts/0 R+ 09:02 0:00 grep --color=auto kvport
[root@node2-cb500-centos7 ~]# kill -9 12409
Balakumarans-MacBook-Pro:testrunner balakumaran.g$ curl http://Administrator:password@10.112.170.101:8092/metadata/_design/dev_d/_view/v?stale=false -s | jq ".rows[].value[0]" | awk '{sum+=$1} END {print sum}'
4032
Balakumarans-MacBook-Pro:testrunner balakumaran.g$
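For readers without jq and awk at hand, the aggregation done by the `jq ".rows[].value[0]" | awk '{sum+=$1} END {print sum}'` pipeline can be sketched in JavaScript. The function name `sumFirstValues` and the sample object are made up for illustration; the only thing taken from the logs is the response shape the jq filter implies (a `rows` array whose entries carry a `value` array).

```javascript
// Sums the first element of each row's `value` array, mirroring the jq/awk pipeline.
// `sumFirstValues` is a hypothetical helper name.
function sumFirstValues(response) {
  return response.rows.reduce((sum, row) => sum + Number(row.value[0]), 0);
}

// Made-up sample in the assumed view-response shape; not data from this bug.
const sample = { rows: [{ value: [2] }, { value: [3] }] };
console.log(sumFirstValues(sample)); // prints 5
```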
You could use the following test to validate the issue once the bug is fixed.
./testrunner -i b/temp_centos7.ini -t eventing.eventing_recovery.EventingRecovery.test_killing_eventing_consumer_when_eventing_is_processing_mutations,nodes_init=4,services_init=kv-eventing-index-n1ql,dataset=default,groups=simple,reset_services=True,skip_cleanup=True,doc-per-day=2
Attachments
For Gerrit Dashboard: MB-27071

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
92166,2 | MB-27071 Stop and spawn Eventing.Consumer if eventing-consumer exits | unstable | eventing | ABANDONED | 0 | 0 |
93162,8 | MB-27071 Exit & respawn Eventing.Consumer upon exit of V8 worker | unstable | eventing | MERGED | +2 | +1 |