
Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: 7.1.0
Affects Version/s: 7.0.2
Component/s: ns_server
Labels:
- system_test_upgrade
- upgrade
Environment:
6.6.3-9808 -> 7.0.2-6668

Triage:
Triaged
Story Points:
1
Is this a Regression?:
No

Description

Steps to Repro
1. Run the following longevity script on 6.6.3 for 5 days.

./sequoia -client 172.23.104.254:2375 -provider file:centos_second_cluster.yml -test tests/integration/test_allFeatures_madhatter_durability.yml -scope tests/integration/scope_Xattrs_Madhatter.yml -scale 3 -repeat 0 -log_level 0 -version 6.6.3-9808 -skip_setup=true -skip_test=false -skip_teardown=true -skip_cleanup=false -continue=false -collect_on_error=false -stop_on_error=false -duration=604800 -show_topology=true

At this point the cluster should have 27 nodes (9 KV, 6 index, 3 analytics, 3 FTS, 3 eventing, and 3 N1QL).
2. Create 10k metakv tombstones. This has been part of our testing since ~~MB-44838~~ was fixed. We used to create around 25k in total for CC; this run reduces it to around 12k.

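 #!/bin/bash
 # Create and then delete each key so that every key leaves a metakv tombstone.
 # bash is required for the {0..10000} brace expansion.
 for i in {0..10000}
 do
     curl -X PUT -u Administrator:password "http://localhost:8091/_metakv/key${i}" -d 'value=foo1'
     curl -X DELETE -v -u Administrator:password "http://localhost:8091/_metakv/key${i}"
 done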

3. Swap rebalance 6 nodes, 1 of each service, with 7.0.2 nodes. The rebalance completes successfully.
4. Fail over 6 of the 6.6.3 nodes, 1 of each service (graceful failover for KV), upgrade these nodes to 7.0.2, recover all 6 nodes (delta recovery for KV), and rebalance.
5. Repeat step 4 until all nodes in the cluster are upgraded to 7.0.2.

After the upgrade, I validated that all the metakv tombstones were purged, then enabled IPv4-only and enforce-TLS using the commands below.

[root@localhost logs]# grep 'ns_config tombstone' debug.log

[ns_server:debug,2021-09-14T04:26:34.722-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 11869 ns_config tombstone(s) up to timestamp 63798837690. Tombstones:

[ns_server:debug,2021-09-14T04:30:38.296-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798837934. Tombstones:

[ns_server:debug,2021-09-14T04:31:41.731-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 127 ns_config tombstone(s) up to timestamp 63798837998. Tombstones:

[ns_server:debug,2021-09-14T04:39:42.419-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798838481. Tombstones:

[ns_server:debug,2021-09-14T04:40:43.040-07:00,ns_1@172.23.106.134:tombstone_agent<0.984.0>:tombstone_agent:purge:195]Purged 1 ns_config tombstone(s) up to timestamp 63798838542. Tombstones:

[root@localhost logs]#
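
To cross-check the purge against the number of tombstones created, the counts in the "Purged N ns_config tombstone(s)" lines can be summed. The following is a minimal sketch that assumes the log line format shown in the grep output above; the function name sum_purged_tombstones is hypothetical and not part of any Couchbase tooling.

```shell
#!/bin/bash
# Sum the counts from "Purged N ns_config tombstone(s)" lines read on stdin.
# Assumes the debug.log line format shown in the grep output above.
sum_purged_tombstones() {
    grep 'ns_config tombstone' \
        | sed -n 's/.*Purged \([0-9][0-9]*\) ns_config tombstone(s).*/\1/p' \
        | awk '{ total += $1 } END { print total + 0 }'
}

# Usage (run from the node's log directory):
#   sum_purged_tombstones < debug.log
```

For the five lines above this gives 11869 + 1 + 127 + 1 + 1 = 11999 purged tombstones on this node.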

[root@localhost logs]#  /opt/couchbase/bin/couchbase-cli ip-family -c http://localhost:8091 -u Administrator -p password --set --ipv4only

Switched IP family for node: http://172.23.106.134:8091

Switched IP family for node: http://172.23.106.136:8091

Switched IP family for node: http://172.23.106.137:8091

Switched IP family for node: http://172.23.106.138:8091

Switched IP family for node: http://172.23.120.58:8091

Switched IP family for node: http://172.23.120.73:8091

Switched IP family for node: http://172.23.120.74:8091

Switched IP family for node: http://172.23.120.75:8091

Switched IP family for node: http://172.23.120.77:8091

Switched IP family for node: http://172.23.120.81:8091

Switched IP family for node: http://172.23.120.86:8091

Switched IP family for node: http://172.23.121.118:8091

Switched IP family for node: http://172.23.121.77:8091

Switched IP family for node: http://172.23.123.24:8091

Switched IP family for node: http://172.23.123.25:8091

Switched IP family for node: http://172.23.123.26:8091

Switched IP family for node: http://172.23.123.31:8091

Switched IP family for node: http://172.23.123.32:8091

Switched IP family for node: http://172.23.123.33:8091

Switched IP family for node: http://172.23.96.122:8091

Switched IP family for node: http://172.23.96.14:8091

Switched IP family for node: http://172.23.96.243:8091

Switched IP family for node: http://172.23.97.105:8091

Switched IP family for node: http://172.23.97.148:8091

Switched IP family for node: http://172.23.97.149:8091

Switched IP family for node: http://172.23.97.150:8091

Switched IP family for node: http://172.23.97.151:8091

SUCCESS: Switched IP family of the cluster

[root@localhost logs]# /opt/couchbase/bin/couchbase-cli node-to-node-encryption -c http://localhost:8091 -u Administrator -p password --enable

Turned on encryption for node: http://172.23.106.134:8091

Turned on encryption for node: http://172.23.106.136:8091

Turned on encryption for node: http://172.23.106.137:8091

Turned on encryption for node: http://172.23.106.138:8091

Turned on encryption for node: http://172.23.120.58:8091

Turned on encryption for node: http://172.23.120.73:8091

Turned on encryption for node: http://172.23.120.74:8091

Turned on encryption for node: http://172.23.120.75:8091

Turned on encryption for node: http://172.23.120.77:8091

Turned on encryption for node: http://172.23.120.81:8091

Turned on encryption for node: http://172.23.120.86:8091

Turned on encryption for node: http://172.23.121.118:8091

Turned on encryption for node: http://172.23.121.77:8091

Turned on encryption for node: http://172.23.123.24:8091

Turned on encryption for node: http://172.23.123.25:8091

Turned on encryption for node: http://172.23.123.26:8091

Turned on encryption for node: http://172.23.123.31:8091

Turned on encryption for node: http://172.23.123.32:8091

Turned on encryption for node: http://172.23.123.33:8091

Turned on encryption for node: http://172.23.96.122:8091

Turned on encryption for node: http://172.23.96.14:8091

Turned on encryption for node: http://172.23.96.243:8091

Turned on encryption for node: http://172.23.97.105:8091

Turned on encryption for node: http://172.23.97.148:8091

Turned on encryption for node: http://172.23.97.149:8091

Turned on encryption for node: http://172.23.97.150:8091

Turned on encryption for node: http://172.23.97.151:8091

SUCCESS: Switched node-to-node encryption on

[root@localhost logs]#  /opt/couchbase/bin/couchbase-cli setting-security -c http://localhost:8091 -u Administrator -p password --set --cluster-encryption-level strict

SUCCESS: Security settings updated

[root@localhost logs]#

At this point I noticed that the Rebalance button was enabled, although as far as I know there was nothing to rebalance. I ran a rebalance anyway. It failed the first time with an error about some nodes being down (possibly because services restart when IPv4-only and enforce-TLS are set), and the next attempt failed with an eventing hang (tracked by ~~MB-48449~~). This bug is to figure out why the Rebalance button was enabled in the first place. I do not remember whether the button was already enabled after the upgrade but before IPv4-only and enforce-TLS were enabled.
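
One way to capture the cluster's own view of whether a rebalance is needed is to read the "balanced" field from the /pools/default REST response. The following is a rough sketch only: it assumes the response contains a boolean "balanced" field, the function name balanced_flag is hypothetical, and whether the UI button is driven by exactly this field is an assumption that this ticket does not confirm.

```shell
#!/bin/bash
# Print the first "balanced" flag found in a /pools/default JSON document
# read from stdin: "true", "false", or "unknown" if the field is absent.
# Assumption: the response carries a boolean "balanced" field.
balanced_flag() {
    local value
    value=$(grep -o '"balanced"[[:space:]]*:[[:space:]]*[a-z]*' \
        | head -n 1 \
        | sed 's/.*:[[:space:]]*//')
    echo "${value:-unknown}"
}

# Usage against a live node (host and credentials as in the commands above):
#   curl -s -u Administrator:password http://localhost:8091/pools/default | balanced_flag
```

Recording this value after the upgrade and again after enabling IPv4-only and enforce-TLS would show at which step the cluster started reporting itself as unbalanced.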

cbcollect_info is attached. This is the first time we are running this system test upgrade on 7.0.2, so there is no baseline and no last working build.

[System Test] - Rebalance button is enabled post upgrade from 6.6.3 -> 7.0.2
