on_update_failure is intermittently non-zero in counter increment SBM test
Description
Components
Affects versions
Environment
7.0.0-5274
Link to Log File, atop/blg, CBCollectInfo, Core dump
None
Release Notes Description
None
Attachments
8
Activity
Vikas Chaudhary September 15, 2021 at 1:48 PM
Not seen on 7.0.2-6644.
CB robot August 2, 2021 at 2:28 PM
Build couchbase-server-7.0.1-5977 contains eventing commit 57aa814 with commit message:
: use op_timeout for lcb operations outer loop
Ritam Sharma July 16, 2021 at 11:32 AM
- - will be best to decide on this ticket.
Jeelan Poola July 12, 2021 at 5:49 AM
The fix for this issue needs to be backported to 7.0.1. Eventing is not retrying transient errors due to this bug. Fix is available in 7.1 already. Request inclusion. Thank you!
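For illustration only, the behaviour implied by the commit title ("use op_timeout for lcb operations outer loop") can be sketched as a retry loop bounded by an overall deadline. This is a minimal sketch and not the Eventing implementation: `TransientError`, `run_with_retry`, and the `backoff` parameter are hypothetical names, and the only idea taken from the ticket is that transient errors should be retried until `op_timeout` elapses.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable (temporary) failure."""


def run_with_retry(op, op_timeout, backoff=0.01):
    """Call op() until it succeeds, retrying transient errors until op_timeout seconds pass."""
    deadline = time.monotonic() + op_timeout
    while True:
        try:
            return op()
        except TransientError:
            # Give up once another backoff would cross the overall deadline.
            if time.monotonic() + backoff >= deadline:
                raise
            time.sleep(backoff)


# Usage: an operation that fails twice with a transient error, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError()
    return "ok"


result = run_with_retry(flaky, op_timeout=1.0, backoff=0.001)
```

Under this assumption, an operation that is never retried surfaces as a failure on its first transient error, which is consistent with the intermittent failures reported here.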
CB robot June 16, 2021 at 1:03 PM
Build couchbase-server-7.1.0-1024 contains eventing commit c7b51f4 with commit message:
: use op_timeout for lcb operations outer loop
Fixed
Details
Assignee
Reporter
Prajwal Kiran Kumar
Prajwal Kiran Kumar
Is this a Regression?
Unknown
Triage
Untriaged
Story Points
1
Sprint
None
Priority
Critical
Created June 8, 2021 at 3:16 PM
Updated September 15, 2021 at 1:48 PM
Resolved August 2, 2021 at 4:58 AM
We see a non-zero on_update_failure count intermittently on reruns of this test:
Functions executed/sec, 1 bucket x 100M x 1KB, 4KV + 1Eventing node, Counter Increment 0% Contention Source bkt mutation
We have seen this on 2 out of 3 reruns of the same test, so it is worth investigating.
Failed runs:
http://perf.jenkins.couchbase.com/job/themis/10745/
http://perf.jenkins.couchbase.com/job/themis/10742/
Passed run:
http://perf.jenkins.couchbase.com/job/themis/10744/
Please note that the resident ratio on the source bucket is slightly below 100% (at 95%) in these runs. We have not seen this behaviour in the 100% resident ratio test so far.
on_update_failures, from the stats:
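As a rough aid for checking this counter after a run, the sketch below scans an already-parsed stats payload for functions whose on_update_failure is non-zero. The payload shape (a list of per-function entries with `function_name` and an `execution_stats` object) is an assumption about the Eventing stats output, and the sample values are hypothetical rather than taken from the runs above.

```python
from typing import Any, Dict, List


def functions_with_update_failures(stats: List[Dict[str, Any]]) -> Dict[str, int]:
    """Return {function_name: on_update_failure} for entries with a non-zero counter."""
    failing = {}
    for entry in stats:
        # Assumed layout: counters live under "execution_stats" for each function.
        count = entry.get("execution_stats", {}).get("on_update_failure", 0)
        if count > 0:
            failing[entry.get("function_name", "<unknown>")] = count
    return failing


# Hypothetical payload for illustration; not data from the failed runs.
sample = [
    {"function_name": "counter_increment",
     "execution_stats": {"on_update_success": 99, "on_update_failure": 3}},
    {"function_name": "other_function",
     "execution_stats": {"on_update_success": 10, "on_update_failure": 0}},
]

print(functions_with_update_failures(sample))
```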