State of vBucket on new master set to "replica" instead of "pending" during DCP takeover

Description

During vBucket takeover, after the old master (producer) has sent all its mutations to the new master, it sends the first vBucket state change message to the new master. This message should set the state of the vBucket on the new master to "pending".

However, ns_server.debug.log shows the state being set to "replica" instead. The DCP_SET_VBUCKET_STATE log message records the contents of the DCP packet sent by the producer memcached.
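
One way to confirm the observed sequence is to pull the state changes for a partition out of the log. The sketch below is illustrative only: the helper name state_changes is hypothetical, and the regular expression assumes the "Partition N changed status to S" message format seen in this ticket's log excerpt. The two sample lines are copied from that excerpt.

```python
import re

# Assumed message format, taken from the consumer-side lines in this ticket.
STATUS_RE = re.compile(r"Partition (\d+) changed status to (\w+)")


def state_changes(log_lines, partition):
    """Return the states a partition moved through, in log order (hypothetical helper)."""
    states = []
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match and int(match.group(1)) == partition:
            states.append(match.group(2))
    return states


# Two lines copied from the log excerpt in this ticket.
sample = [
    "[rebalance:debug,2015-09-13T17:43:17.402-07:00,n_2@127.0.0.1:"
    "dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:"
    "dcp_consumer_conn:handle_packet:133]Partition 767 changed status to replica",
    "[rebalance:debug,2015-09-13T17:43:17.403-07:00,n_2@127.0.0.1:"
    "dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:"
    "dcp_consumer_conn:handle_packet:133]Partition 767 changed status to active",
]

print(state_changes(sample, 767))
```

For the lines in this ticket the sequence is replica then active, whereas the takeover described here should go through pending before active.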

Discussed with Manu, who will investigate further.

Please update the fix and affects versions as required.

Here are the relevant messages from the debug.log.

[ns_server:debug,2015-09-13T17:43:17.400-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_commands:add_stream:66]Add stream for partition 767, opaque = 0x2FF, type = takeover

[ns_server:debug,2015-09-13T17:43:17.401-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_proxy:handle_packet:115]Proxy packet: REQUEST: 0x53 (dcp_stream_req) vbucket = 767 opaque = 0x8000000
80 53 00 00$
...

[ns_server:debug,2015-09-13T17:43:17.401-07:00,n_2@127.0.0.1:<0.1139.0>:dcp_proxy:handle_packet:115]Proxy packet: RESPONSE: 0x53 (dcp_stream_req) vbucket = 0 opaque = 0x8000000 status = 0x0 (success)$
81 53 00 00$
00 00 00 00$
...

[ns_server:debug,2015-09-13T17:43:17.402-07:00,n_2@127.0.0.1:<0.1139.0>:dcp_producer_conn:handle_packet:33]?DCP_SET_VBUCKET_STATE: proxy Packet:<<128,91,0,0,1,0,2,255,0,0,0,1,8,0,0,0,0,0,0,0,0,0,0,0,2>>

[rebalance:debug,2015-09-13T17:43:17.402-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_consumer_conn:handle_cast:284]Partition 767 is about to change status to replica <============== should be pending

[ns_server:debug,2015-09-13T17:43:17.402-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_proxy:handle_packet:115]Proxy packet: RESPONSE: 0x5B (dcp_set_vbucket_state) vbucket = 0 opaque = 0x8000000 status = 0x0 (success)$
81 5B 00 00$
00 00 00 00$
00 00 00 00$
08 00 00 00$
00 00 00 00$
00 00 00 00$
$
[rebalance:debug,2015-09-13T17:43:17.402-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_consumer_conn:handle_packet:133]Partition 767 changed status to replica <============== should be pending

[ns_server:debug,2015-09-13T17:43:17.403-07:00,n_2@127.0.0.1:<0.1139.0>:dcp_proxy:handle_packet:115]Proxy packet: REQUEST: 0x5B (dcp_set_vbucket_state) vbucket = 767 opaque = 0x8000000$
80 5B 00 00$
01 00 02 FF$
00 00 00 01$
08 00 00 00$
00 00 00 00$
00 00 00 00$
01 $
[ns_server:debug,2015-09-13T17:43:17.403-07:00,n_2@127.0.0.1:<0.1139.0>:dcp_producer_conn:handle_packet:33]?DCP_SET_VBUCKET_STATE: proxy Packet:<<128,91,0,0,1,0,2,255,0,0,0,1,8,0,0,0,0,$
0,0,0,0,0,0,0,1>> $
$
[rebalance:debug,2015-09-13T17:43:17.403-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_consumer_conn:handle_cast:284]Partition 767 is about to change status to active$
[ns_server:debug,2015-09-13T17:43:17.403-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_proxy:handle_packet:115]Proxy packet: RESPONSE: 0x5B (dcp_set_vbucket_state) vbucket = 0 opaque = 0x8000000 status = 0x0 (success)$
81 5B 00 00$
00 00 00 00$
00 00 00 00$
08 00 00 00$
00 00 00 00$
00 00 00 00$
$
[rebalance:debug,2015-09-13T17:43:17.403-07:00,n_2@127.0.0.1:dcp_consumer_conn-default-n_0@10.0.0.29<0.1138.0>:dcp_consumer_conn:handle_packet:133]Partition 767 changed status to active$
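
As a cross-check on the log excerpt above, the two DCP_SET_VBUCKET_STATE packets can be decoded by hand. The following is a minimal Python sketch, not ns_server code. It assumes the standard 24-byte memcached binary protocol request header followed by a one-byte extras field holding the state, and the usual state numbering (1 = active, 2 = replica, 3 = pending, 4 = dead). The function name decode_set_vbucket_state is hypothetical.

```python
import struct

# Assumed vBucket state numbering: 1 = active, 2 = replica, 3 = pending, 4 = dead.
VBUCKET_STATES = {1: "active", 2: "replica", 3: "pending", 4: "dead"}

# Assumed 24-byte request header layout (big-endian): magic, opcode,
# key length, extras length, data type, vbucket, total body length, opaque, CAS.
HEADER = struct.Struct(">BBHBBHIIQ")

DCP_SET_VBUCKET_STATE = 0x5B


def decode_set_vbucket_state(packet):
    """Decode a DCP set-vbucket-state request (hypothetical helper)."""
    (magic, opcode, _key_len, extras_len, _datatype,
     vbucket, _body_len, opaque, _cas) = HEADER.unpack_from(packet)
    if magic != 0x80 or opcode != DCP_SET_VBUCKET_STATE:
        raise ValueError("not a dcp_set_vbucket_state request")
    if extras_len != 1 or len(packet) < HEADER.size + 1:
        raise ValueError("expected a one-byte extras field")
    state = packet[HEADER.size]  # the single extras byte carries the state
    return {
        "vbucket": vbucket,
        "opaque": opaque,
        "state": VBUCKET_STATES.get(state, "unknown (%d)" % state),
    }


# Byte lists copied from the two "proxy Packet" log lines in this ticket.
first = bytes([128, 91, 0, 0, 1, 0, 2, 255, 0, 0, 0, 1,
               8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2])
second = bytes([128, 91, 0, 0, 1, 0, 2, 255, 0, 0, 0, 1,
                8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1])

print(decode_set_vbucket_state(first))
print(decode_set_vbucket_state(second))
```

Under those assumptions, the first packet carries state 2 (replica) and the second carries state 1 (active), which matches the consumer-side log messages. A takeover behaving as described in this ticket would be expected to carry 3 (pending) in the first packet.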

Components

Affects versions

Fix versions

Labels

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

blocks

Activity

Eric Cooper March 9, 2016 at 8:12 PM

Reopening to clone to 4.1.1

Dave Finlay January 14, 2016 at 5:22 AM

We'll be reverting this fix because the SDKs are concerned that KV engine blocks when the vbucket is in pending state. After discussion between the SDKs and KV Engine, the proposal is to change KV engine to return NMVB without a cluster config in pending state. This also requires changes in memcached, which is why we're reverting this fix and planning to fix it in 4.1.1, where we have more time to make the changes.

Abhi Dangeti January 6, 2016 at 5:04 PM

During rebalance, for a small window of time, the vbucket state of the new master will be incorrectly set to replica instead of pending, which this change addresses.

Abhi Dangeti December 21, 2015 at 7:56 PM

Abhi Dangeti December 21, 2015 at 7:17 PM

I will backport this change to the 3.1.4/4.1.1 releases.

Fixed

Details

Assignee

Reporter

Is this a Regression?

Unknown

Triage

Untriaged

Priority

Created September 16, 2015 at 12:13 AM
Updated March 9, 2016 at 8:34 PM
Resolved March 9, 2016 at 8:34 PM