Details
Bug
Resolution: Fixed
Major
7.1.0, Cheshire-Cat
7.0.0-5190-enterprise
Triaged
Centos 64-bit
1
Yes
Description
Script to Repro
guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/win10-bucket-ops.ini rerun=False,get-cbcollect-info=True,quota_percent=99,crash_warning=True -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_hard_failover_rebalance_out,nodes_init=5,nodes_failover=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=during,quota_percent=80,skip_validations=True,GROUP=failover_with_collection_crud'
Steps to Repro
1. Create a 5 node cluster
2021-05-19 18:36:26,311 | test | INFO | pool-6-thread-6 | [table_view:display:72] Rebalance Overview
-----------------------------------------------------------------------------------------
Nodes          | Services        | Version               | CPU           | Status       |
-----------------------------------------------------------------------------------------
172.23.98.196  | index, kv, n1ql | 7.0.0-5190-enterprise | 19.4500504541 | Cluster node |
172.23.98.195  | None            |                       |               | <--- IN ---  |
172.23.121.10  | None            |                       |               | <--- IN ---  |
172.23.104.186 | None            |                       |               | <--- IN ---  |
172.23.120.201 | None            |                       |               | <--- IN ---  |
-----------------------------------------------------------------------------------------
2. Create buckets/scopes/collections/data
2021-05-19 18:40:58,237 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
--------------------------------------------------------------------------------------------------
Bucket  | Type      | Replicas | Durability | TTL | Items  | RAM Quota   | RAM Used  | Disk Used |
--------------------------------------------------------------------------------------------------
bucket1 | couchbase | 3        | none       | 0   | 3000   | 1048576000  | 206971928 | 344776386 |
bucket2 | ephemeral | 3        | none       | 0   | 3000   | 1048576000  | 319662728 | 170       |
default | couchbase | 3        | none       | 0   | 500000 | 10485760000 | 696018264 | 540211277 |
--------------------------------------------------------------------------------------------------
3. Hard failover 2 nodes
2021-05-19 18:41:09,678 | test | INFO | MainThread | [collections_rebalance:rebalance_operation:388] Starting rebalance operation of type : hard_failover_rebalance_out
2021-05-19 18:41:09,680 | test | INFO | MainThread | [collections_rebalance:rebalance_operation:632] failing over nodes [ip:172.23.104.186 port:8091 ssh_username:root, ip:172.23.120.201 port:8091 ssh_username:root]
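The failover request for ns_1@172.23.104.186 fails with HTTP 500. The server reports failover_failed for bucket "default" and names ns_1@172.23.120.201: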
2021-05-19 18:41:31,765 | test | ERROR | pool-6-thread-3 | [rest_client:_http_request:748] POST http://172.23.98.196:8091/controller/failOver body: otpNode=ns_1%40172.23.104.186&allowUnsafe=false headers: {'Accept': '*/*', 'Connection': 'close', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==\n', 'Content-Type': 'application/x-www-form-urlencoded'} error: 500 reason: status: 500, content: Unexpected server error: {failover_failed,"default",
"Failed to get failover info for bucket \"default\": ['ns_1@172.23.120.201']"} Unexpected server error: {failover_failed,"default",
"Failed to get failover info for bucket \"default\": ['ns_1@172.23.120.201']"} auth: Administrator:password
2021-05-19 18:41:31,769 | test | ERROR | pool-6-thread-3 | [rest_client:fail_over:1276] ns_1@172.23.104.186 - Failover error: Unexpected server error: {failover_failed,"default",
"Failed to get failover info for bucket \"default\": ['ns_1@172.23.120.201']"}
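For anyone replaying step 3 by hand, the failing call in the log can be rebuilt roughly as follows. This is only a sketch using the Python standard library: the endpoint, form fields, and Content-Type are taken from the log line above, while `build_failover_request` is a hypothetical helper name and authentication is deliberately left out. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_failover_request(host, otp_node, allow_unsafe=False, port=8091):
    """Build (but do not send) the hard-failover POST seen in the log.

    Hypothetical helper for illustration; credentials are not attached.
    """
    # Form fields as they appear in the logged request body.
    body = urlencode({
        "otpNode": otp_node,
        "allowUnsafe": "true" if allow_unsafe else "false",
    })
    return Request(
        url=f"http://{host}:{port}/controller/failOver",
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_failover_request("172.23.98.196", "ns_1@172.23.104.186")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

The printed body should match the `body:` field in the log, which makes it easy to confirm that a manual repro sends the same request as the test.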
cbcollect_info logs are attached. This failure was not seen in the weekly run on 7.0.0-5161.
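When comparing this run against other builds, it may help to pull the bucket and node names out of the error text automatically. The sketch below is an assumption-laden helper, not part of the test framework: `parse_failover_error` is a hypothetical name, and the patterns assume the `{failover_failed,"<bucket>",...}` shape and the `ns_1@` node-name prefix seen in the log above.

```python
import re

# Patterns based only on the error text quoted in this ticket (assumed shape).
BUCKET_RE = re.compile(r'\{failover_failed,"([^"]+)"')
NODES_RE = re.compile(r"'(ns_1@[^']+)'")


def parse_failover_error(text):
    """Return the bucket and node names found in a failover_failed error.

    Returns None when the text does not contain a failover_failed tuple.
    """
    match = BUCKET_RE.search(text)
    if match is None:
        return None
    # The log repeats the error text, so de-duplicate the node names.
    nodes = sorted(set(NODES_RE.findall(text)))
    return {"bucket": match.group(1), "nodes": nodes}


sample = (
    'Unexpected server error: {failover_failed,"default",\n'
    '"Failed to get failover info for bucket \\"default\\": '
    "['ns_1@172.23.120.201']\"}"
)
print(parse_failover_error(sample))
```

Run against the error in this ticket, the helper reports bucket `default` and node `ns_1@172.23.120.201`, which is the second node in the failover list rather than the `otpNode` in the failing request.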