Couchbase Server / MB-46410

[Collections] - Hard failover fails with "Unexpected server error: {failover_failed,"default", Failed to get failover info for bucket \"default\""

Details

    • Triaged
    • Centos 64-bit
    • 1
    • Yes

    Description

      Script to Repro

      guides/gradlew --refresh-dependencies testrunner -P jython=/opt/jython/bin/jython -P 'args=-i /tmp/win10-bucket-ops.ini rerun=False,get-cbcollect-info=True,quota_percent=99,crash_warning=True -t bucket_collections.collections_rebalance.CollectionsRebalance.test_data_load_collections_with_hard_failover_rebalance_out,nodes_init=5,nodes_failover=2,bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,data_load_spec=volume_test_load_with_CRUD_on_collections,data_load_stage=during,quota_percent=80,skip_validations=True,GROUP=failover_with_collection_crud'
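
For readers unfamiliar with the `-t` argument above, the following is a minimal sketch of how the test specification breaks down into a test path and key=value parameters. `parse_test_spec` is a hypothetical helper written for illustration only; it is not testrunner's own parser, and the spec string is copied from the command above.

```python
def parse_test_spec(spec):
    """Split 'module.Class.test,key=value,...' into (test_path, params)."""
    test_path, *pairs = spec.split(",")
    params = {}
    for pair in pairs:
        # Everything before the first '=' is the key, the rest is the value.
        key, _, value = pair.partition("=")
        params[key] = value
    return test_path, params


# The -t argument from the repro command, split across lines for readability.
spec = (
    "bucket_collections.collections_rebalance.CollectionsRebalance."
    "test_data_load_collections_with_hard_failover_rebalance_out,"
    "nodes_init=5,nodes_failover=2,"
    "bucket_spec=multi_bucket.buckets_for_rebalance_tests_more_collections,"
    "data_load_spec=volume_test_load_with_CRUD_on_collections,"
    "data_load_stage=during,quota_percent=80,skip_validations=True,"
    "GROUP=failover_with_collection_crud"
)

test_path, params = parse_test_spec(spec)
print(test_path.rsplit(".", 1)[-1])
print(params["nodes_init"], params["nodes_failover"])
```

The two parameters most relevant to this report are `nodes_init=5` and `nodes_failover=2`, which correspond to steps 1 and 3 below.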
      

      Steps to Repro
      1. Create a 5 node cluster
      2021-05-19 18:36:26,311 | test | INFO | pool-6-thread-6 | [table_view:display:72] Rebalance Overview
      -----------------------------------------------------------------------------------
      Nodes           Services         Version                CPU            Status
      -----------------------------------------------------------------------------------
      172.23.98.196   index, kv, n1ql  7.0.0-5190-enterprise  19.4500504541  Cluster node
      172.23.98.195   None                                                   <--- IN ---
      172.23.121.10   None                                                   <--- IN ---
      172.23.104.186  None                                                   <--- IN ---
      172.23.120.201  None                                                   <--- IN ---
      -----------------------------------------------------------------------------------

      2. Create buckets/scopes/collections/data
      2021-05-19 18:40:58,237 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
      ----------------------------------------------------------------------------------------
      Bucket   Type       Replicas  Durability  TTL  Items   RAM Quota    RAM Used   Disk Used
      ----------------------------------------------------------------------------------------
      bucket1  couchbase  3         none        0    3000    1048576000   206971928  344776386
      bucket2  ephemeral  3         none        0    3000    1048576000   319662728  170
      default  couchbase  3         none        0    500000  10485760000  696018264  540211277
      ----------------------------------------------------------------------------------------

      3. Hard failover 2 nodes

      2021-05-19 18:41:09,678 | test  | INFO    | MainThread | [collections_rebalance:rebalance_operation:388] Starting rebalance operation of type : hard_failover_rebalance_out
      2021-05-19 18:41:09,680 | test  | INFO    | MainThread | [collections_rebalance:rebalance_operation:632] failing over nodes [ip:172.23.104.186 port:8091 ssh_username:root, ip:172.23.120.201 port:8091 ssh_username:root]
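
The failover in this step goes through the REST endpoint that appears in the error log further down: `POST /controller/failOver` with a form-encoded body. As a rough sketch, the body for one node can be rebuilt as follows. The field names `otpNode` and `allowUnsafe` and their values are taken from that log line; the helper name `build_failover_body` is hypothetical, and the sketch only builds the body without sending a request.

```python
from urllib.parse import urlencode


def build_failover_body(otp_node, allow_unsafe=False):
    """Form-encode the failover request body for a single node."""
    return urlencode({
        "otpNode": otp_node,
        # The log shows a lowercase 'false', so mirror that spelling.
        "allowUnsafe": str(allow_unsafe).lower(),
    })


body = build_failover_body("ns_1@172.23.104.186")
print(body)  # otpNode=ns_1%40172.23.104.186&allowUnsafe=false
```

The printed body matches the one logged for the failing request, which can help when replaying the call by hand against a test cluster.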
      

      Failover fails as shown below.

      2021-05-19 18:41:31,765 | test  | ERROR   | pool-6-thread-3 | [rest_client:_http_request:748] POST http://172.23.98.196:8091/controller/failOver body: otpNode=ns_1%40172.23.104.186&allowUnsafe=false headers: {'Accept': '*/*', 'Connection': 'close', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==\n', 'Content-Type': 'application/x-www-form-urlencoded'} error: 500 reason: status: 500, content: Unexpected server error: {failover_failed,"default",
                                   "Failed to get failover info for bucket \"default\": ['ns_1@172.23.120.201']"} Unexpected server error: {failover_failed,"default",
                                   "Failed to get failover info for bucket \"default\": ['ns_1@172.23.120.201']"} auth: Administrator:password
      2021-05-19 18:41:31,769 | test  | ERROR   | pool-6-thread-3 | [rest_client:fail_over:1276] ns_1@172.23.104.186 - Failover error: Unexpected server error: {failover_failed,"default",
                                   "Failed to get failover info for bucket \"default\": ['ns_1@172.23.120.201']"}
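
When triaging several of these failures, it can help to pull the bucket name and the node list out of the error text. The following is a minimal sketch that assumes the message keeps the shape shown in the log above. The regular expression and the helper name `parse_failover_error` are illustrative assumptions, not part of testrunner or ns_server.

```python
import re

# Matches: {failover_failed,"<bucket>", "<message>: [<nodes>]"}
ERROR_RE = re.compile(
    r'\{failover_failed,"(?P<bucket>[^"]+)",\s*".*?: \[(?P<nodes>[^\]]*)\]"\}',
    re.S,
)


def parse_failover_error(text):
    """Return (bucket, [nodes]) from a failover_failed error, or None."""
    match = ERROR_RE.search(text)
    if match is None:
        return None
    nodes = [
        node.strip().strip("'")
        for node in match.group("nodes").split(",")
        if node.strip()
    ]
    return match.group("bucket"), nodes


# Error content copied from the log above (backslashes are literal).
sample = (
    'Unexpected server error: {failover_failed,"default",\n'
    '    "Failed to get failover info for bucket \\"default\\": '
    "['ns_1@172.23.120.201']\"}"
)

print(parse_failover_error(sample))
```

In this report the node named in the error, ns_1@172.23.120.201, is the second node in the failover set from step 3, not the `otpNode` (ns_1@172.23.104.186) in the failing request.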
      

      cbcollect_info logs are attached. This failure was not seen in the weekly run on 7.0.0-5161.

      Attachments

        1. screenshot-1.png (45 kB)
        2. screenshot-2.png (41 kB)
        3. screenshot-3.png (42 kB)
        4. screenshot-4.png (45 kB)
        5. test_1.zip (14.90 MB)
        6. test.log (20 kB)
          People

            Balakumaran Gopal (Balakumaran.Gopal)