Details
- Type: Task
- Resolution: Done
- Priority: Major
- Fix Version: 6.6.0
- 1
- Sprints: CX Sprint 205, CX Sprint 206, CX Sprint 207, CX Sprint 208, CX Sprint 209
Description
Seeing something perhaps related on one of our upgrade tests today. This test does the following:
1. starts four alice nodes (two kv, two cbas)
2. ingests all beers
3. fails over both cbas nodes, upgrades them both to 6.6 simultaneously
4. ensures all beers are (still / again) ingested
5. fails over & upgrades both kv nodes, one at a time
6. ensures all beers are (still / again) ingested <<<---- FAILS
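The "ensures all beers are ingested" steps are essentially a poll-until-stable check on the Analytics count. A minimal sketch of that pattern, assuming a generic query callable (the `fake_beer_count` stub and the timeout values are stand-ins, not the real test harness):

```python
import time

def poll_until(query, expected, timeout_s=30.0, interval_s=0.5):
    """Repeatedly run `query` until it returns `expected` or the deadline
    passes. Returns the last observed result either way, so the caller can
    report what was actually seen (e.g. 2942 instead of 5891)."""
    deadline = time.monotonic() + timeout_s
    result = query()
    while result != expected and time.monotonic() < deadline:
        time.sleep(interval_s)
        result = query()
    return result

# Stand-in for the real Analytics count query against beer-sample:
# returns a stale count twice, then the full count.
def fake_beer_count(_results=iter([2942, 2942, 5891])):
    return next(_results, 5891)

final = poll_until(fake_beer_count, expected=5891, timeout_s=5.0, interval_s=0.01)
assert final == 5891
```

In the failing run, the equivalent check times out with the last observed result stuck at 2942 for the entire polling period.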
At the last step, the number of beers is only 2942 instead of the expected 5891, throughout the entire polling period:

Expected result to be [ {
  "$1" : 5891
} ] but last result acquired = [ {
  "$1" : 2942
} ]
see console.log in the attached failoverUpgradeAll.zip
It looks like the upgrade in step 5 failed due to the following:

"completionMessage": "Rebalance exited with reason {pre_rebalance_janitor_run_failed,\"beer-sample\",\n {error,wait_for_memcached_failed,\n ['ns_1@kv1.couchbase.host']}}."
I checked the Analytics logs: we repeatedly fail to fetch the failover logs because of the kv node issue. We probably need to add extra checks to the test, and file an issue against ns_server/kv if we keep encountering this.
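One such additional check could be to inspect the rebalance completionMessage and classify wait_for_memcached_failed as a kv/environment failure (retry or file against ns_server/kv) rather than a test failure. A hedged sketch: the message string below is copied verbatim from the log above, but the function name and classification scheme are hypothetical, not part of the real test:

```python
import re

# Copied from the rebalance failure reported above.
COMPLETION_MESSAGE = (
    'Rebalance exited with reason {pre_rebalance_janitor_run_failed,"beer-sample",\n'
    " {error,wait_for_memcached_failed,\n"
    " ['ns_1@kv1.couchbase.host']}}."
)

def classify_rebalance_failure(message):
    """Return (reason, affected_nodes) if the message is a
    wait_for_memcached failure, else None."""
    m = re.search(r"wait_for_memcached_failed,\s*\[([^\]]*)\]", message)
    if not m:
        return None
    nodes = [n.strip().strip("'") for n in m.group(1).split(",")]
    return ("wait_for_memcached_failed", nodes)

print(classify_rebalance_failure(COMPLETION_MESSAGE))
# → ('wait_for_memcached_failed', ['ns_1@kv1.couchbase.host'])
```

The test could then skip or retry the upgrade step on this classification instead of failing the ingestion assertion later.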
Attachments
Issue Links
- Clones: MB-39955 [CX] intermittent partial ingestion? (feeds: connect-meta-filtered-buckets, index: nested-fields) (Closed)