[System Test]: Handlers hung in deploying state

Description

Build: 6.6.1-9213, passed on 6.6.1-9207

Test: Eventing component

Day: 3rd 

Cycle: 15

Test Step: A rebalance failed, after which the handlers hung in the deploying state. This caused subsequent rebalances to fail.

Components

Affects versions

Fix versions

Labels

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Attachments

2

Activity

Vikas Chaudhary March 22, 2021 at 6:59 AM

Not seen on 7.0.0-4669

CB robot December 18, 2020 at 4:41 AM

Build couchbase-server-7.0.0-4073 contains eventing commit b036e1f with commit message:
: Lock the bucketmap before connecting to the bucket

Ankit Prabhu December 17, 2020 at 11:28 AM
Edited

A function can be deployed in two ways: from SettingsChangeCallback and from TopologyChangeCallback. During rebalance-in of an eventing node, these two callbacks can race, so two different functions can watch the same bucket at the same time. Eventing then updates the bucketMap with the second function, and the first one is not included in the bucketMap because of the race between the concurrent accesses to the map. When the first function then calls GetBucket, the call fails and the function exits, but it remains in the bootstrapping list. Users can run into this issue when they have multiple functions against the same source bucket and they rebalance in an eventing node.
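To illustrate the race described above and the general shape of a "lock the map before connecting" fix, here is a minimal Go sketch. It is not the actual Eventing code or the actual commit: `bucketWatcher`, `watchBucket`, `getBucket`, and `deployConcurrently` are hypothetical names, and the bucket connection is reduced to creating a map entry. The point is only that the lookup, the connect step, and the insert happen under one mutex, so two concurrent deployments against the same source bucket cannot overwrite each other's registration.

```go
package main

import (
	"fmt"
	"sync"
)

// bucketWatcher is a hypothetical stand-in for the bookkeeping that maps a
// source bucket to the set of functions watching it.
type bucketWatcher struct {
	mu        sync.Mutex
	bucketMap map[string]map[string]struct{} // bucket name -> watching functions
}

func newBucketWatcher() *bucketWatcher {
	return &bucketWatcher{bucketMap: make(map[string]map[string]struct{})}
}

// watchBucket registers fn as a watcher of bucket. The lock is taken before
// the lookup and held until the entry is written, so a second caller sees the
// first caller's entry instead of replacing it.
func (w *bucketWatcher) watchBucket(bucket, fn string) {
	w.mu.Lock()
	defer w.mu.Unlock()

	fns, ok := w.bucketMap[bucket]
	if !ok {
		// Placeholder for connecting to the bucket (assumption: done once per bucket).
		fns = make(map[string]struct{})
		w.bucketMap[bucket] = fns
	}
	fns[fn] = struct{}{}
}

// getBucket reports whether fn is registered against bucket.
func (w *bucketWatcher) getBucket(bucket, fn string) bool {
	w.mu.Lock()
	defer w.mu.Unlock()

	_, ok := w.bucketMap[bucket][fn]
	return ok
}

// deployConcurrently simulates two callbacks deploying different functions
// against the same bucket at the same time.
func deployConcurrently(w *bucketWatcher, bucket string, fns ...string) {
	var wg sync.WaitGroup
	for _, fn := range fns {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			w.watchBucket(bucket, name)
		}(fn)
	}
	wg.Wait()
}

func main() {
	w := newBucketWatcher()
	deployConcurrently(w, "src", "fn1", "fn2")
	// Both functions stay registered regardless of goroutine ordering.
	fmt.Println(w.getBucket("src", "fn1"), w.getBucket("src", "fn2"))
}
```

Without the mutex, both goroutines could find the bucket missing, each build its own entry, and the later write would drop the earlier function, which matches the failing GetBucket symptom described in this comment.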

Vikas Chaudhary December 17, 2020 at 9:26 AM
Edited

The system recovered after killing all the producers on the eventing nodes one by one, and the rebalance then passed.

Logs: http://supportal.couchbase.com/snapshot/80d6eeb26726b35ea7abfb43a3070ecb::1

Subsequent rebalance and lifecycle operations passed too.

Fixed

Details

Assignee

Reporter

Is this a Regression?

Yes

Triage

Untriaged

Story Points

Priority

Created December 17, 2020 at 6:36 AM
Updated June 17, 2021 at 10:09 PM
Resolved December 18, 2020 at 3:53 AM