Couchbase Server / MB-46214

[Chronicle] Quorum failover fails with "error no_leader"

    Description

      Summary
      This might be a duplicate of https://issues.couchbase.com/browse/MB-45064, but I am filing it anyway because I am not sure.

      Steps to Reproduce:
      1. Create a 4-node KV cluster: .215, .217, .219, .237

      +----------------+----------+-----------------------+--------------+--------------+
      | Nodes          | Services | Version               | CPU          | Status       |
      +----------------+----------+-----------------------+--------------+--------------+
      | 172.23.105.215 | kv       | 7.0.0-5127-enterprise | 3.4188034188 | Cluster node |
      | 172.23.105.217 | None     |                       |              | <--- IN ---  |
      | 172.23.105.219 | None     |                       |              | <--- IN ---  |
      | 172.23.106.237 | None     |                       |              | <--- IN ---  |
      +----------------+----------+-----------------------+--------------+--------------+

      2. Create a bucket with some collections and items

      +---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
      | Bucket  | Type      | Replicas | Durability | TTL | Items  | RAM Quota  | RAM Used  | Disk Used |
      +---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+
      | default | couchbase | 3        | none       | 0   | 500000 | 8388608000 | 833984064 | 556314064 |
      +---------+-----------+----------+------------+-----+--------+------------+-----------+-----------+

      3. Introduce a symmetric network partition, i.e. split the cluster into two equal, mutually exclusive halves:
      first half: .215, .217
      second half: .219, .237
      4. Perform an unsafe quorum failover of the second half by sending a REST request to the orchestrator:

      2021-05-10 02:38:27,967 | test  | ERROR   | pool-2-thread-29 | [rest_client:_http_request:748] POST http://172.23.105.215:8091/controller/failOver body: otpNode=ns_1%40172.23.105.219&otpNode=ns_1%40172.23.106.237&allowUnsafe=true headers: {'Accept': '*/*', 'Connection': 'close', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==\n', 'Content-Type': 'application/x-www-form-urlencoded'} error: 500 reason: status: 500, content: Unexpected server error: no_leader Unexpected server error: no_leader auth: Administrator:password

      The request fails with HTTP 500 and "Unexpected server error: no_leader".
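
For anyone trying to reproduce step 4 outside the test framework, here is a minimal Python sketch that rebuilds the same POST from the log line above. The endpoint, form fields, and credentials are copied from that log; the helper name `build_unsafe_failover_request` is hypothetical, and the sketch only constructs the request without sending it.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_unsafe_failover_request(orchestrator, otp_nodes, user, password):
    """Build (but do not send) the POST /controller/failOver request.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # One otpNode field per node to fail over, followed by allowUnsafe=true,
    # in the same order as the body shown in the test log.
    fields = [("otpNode", node) for node in otp_nodes]
    fields.append(("allowUnsafe", "true"))
    body = urlencode(fields).encode("ascii")

    # HTTP Basic auth header built from user:password.
    token = base64.b64encode(f"{user}:{password}".encode()).decode("ascii")
    return Request(
        url=f"http://{orchestrator}:8091/controller/failOver",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


req = build_unsafe_failover_request(
    "172.23.105.215",
    ["ns_1@172.23.105.219", "ns_1@172.23.106.237"],
    "Administrator",
    "password",
)
print(req.get_method(), req.full_url)
print(req.data.decode())
```

Passing `req` to `urllib.request.urlopen` would actually send it. On a cluster in the state described here, that call would be expected to raise `urllib.error.HTTPError` with status 500.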

      Observations
      error.log on .215 shows:

      [ns_server:error,2021-05-10T02:38:27.960-07:00,ns_1@172.23.105.215:<0.4540.0>:chronicle_master:handle_call:184]Unsuccesfull quorum loss failover. (no_leader).
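
When searching the logs of the other nodes for the same failure, it may help to split entries like the one above into fields. The following is a rough sketch that assumes every entry follows the layout of this single line; the regular expression and the group names (`component`, `level`, and so on) are my own labels, not an official ns_server log schema.

```python
import re

# Layout assumed from the one error.log line quoted in this ticket:
# [component:level,timestamp,node:<pid>:module:function:line]message
LOG_RE = re.compile(
    r"^\[(?P<component>[^:]+):(?P<level>[^,]+),"
    r"(?P<timestamp>[^,]+),"
    r"(?P<node>[^:]+):(?P<pid><[^>]+>):"
    r"(?P<module>[^:]+):(?P<function>[^:]+):(?P<line>\d+)\]"
    r"(?P<message>.*)$"
)

entry = (
    "[ns_server:error,2021-05-10T02:38:27.960-07:00,"
    "ns_1@172.23.105.215:<0.4540.0>:chronicle_master:handle_call:184]"
    "Unsuccesfull quorum loss failover. (no_leader)."
)

match = LOG_RE.match(entry)
fields = match.groupdict() if match else {}

# Print each extracted field on its own line.
for name, value in fields.items():
    print(f"{name}: {value}")
```

Filtering on `fields["module"] == "chronicle_master"` would then narrow a larger log down to entries from the module that reported the failed failover.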

          People

            sumedh.basarkod Sumedh Basarkod (Inactive)