  1. Couchbase Server
  2. MB-62644

Rebalance failure - backup service error 'transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority'


Details

    • Bug
    • Resolution: Unresolved
    • Major
    • Morpheus
    • 7.2.2
    • ns_server
    • None
    • Couchbase Server 7.2.2

    Description

      Steps followed:

      1. Brought up nodes and initialised node1 only (with KV and backup service).
      2. Loaded certificates onto the nodes. ca.pem and int*.pem were loaded into the trust store of the respective nodes (e.g. int1.pem was loaded onto node1 only). couchbase.node*.svc.pem was loaded as the node certificate on the respective node.
      3. From the Web UI for node1, added node2 (selecting KV and backup again), then initiated a rebalance.
      4. From the Web UI for node1, added node3 (selecting KV and backup again), then initiated a rebalance.

       

      Behaviour seen:

      Node addition was allowed both times, but both rebalances failed with the following error:

      Rebalance exited with reason {service_rebalance_failed,backup,
                                    {worker_died,
                                     {'EXIT',<0.32179.6>,
                                      {rebalance_failed,
                                       {service_error,
                                        <<"could not add node 'fe95c4b084438690778a8785d10bf2ad': exhausted retry count after 5 attempts: rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority\"">>}}}}}. 

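For triage, the nested Erlang term above can be reduced to the fields that matter: the failing service, the node UUID, the retry count and the underlying TLS error. The following is a minimal sketch, assuming the message always has the shape quoted above. The helper name `summarise_rebalance_failure` and the regular expression are hypothetical and are not part of any Couchbase tooling.

```python
import re
from typing import Dict, Optional

# Sample copied from the rebalance failure quoted in this ticket.
SAMPLE = r'''Rebalance exited with reason {service_rebalance_failed,backup,
 {worker_died,
  {'EXIT',<0.32179.6>,
   {rebalance_failed,
    {service_error,
     <<"could not add node 'fe95c4b084438690778a8785d10bf2ad': exhausted retry count after 5 attempts: rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority\"">>}}}}}.'''

# ASSUMPTION: this pattern is derived from the single message above only;
# other services or error paths may format their messages differently.
_PATTERN = re.compile(
    r"service_rebalance_failed,(?P<service>\w+),"
    r".*?could not add node '(?P<node>[0-9a-f]+)'"
    r".*?after (?P<attempts>\d+) attempts"
    r'.*?(?P<tls>x509: [^\\"]+)',
    re.DOTALL,
)


def summarise_rebalance_failure(message: str) -> Optional[Dict[str, object]]:
    """Return the key fields of the failure, or None if the shape differs."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return {
        "service": match.group("service"),
        "node": match.group("node"),
        "attempts": int(match.group("attempts")),
        "tls_error": match.group("tls").strip(),
    }


if __name__ == "__main__":
    print(summarise_rebalance_failure(SAMPLE))
```

The extracted `tls_error` is the part to compare against the customer's logs, since the surrounding Erlang wrapper may differ between versions.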
       

      Certificates:

      The certificates are designed to match a customer's setup, where the same error was seen during a rebalance. A single root certificate was used, but each node has a different intermediate certificate (int1.pem and int2.pem were, however, created using the same private key, to see whether this would make any difference). Each node certificate has a suitable DNS SAN.

       

      As certificates are lost when node addition is performed, I've attached them to this ticket.

       

      Question:

      Taking node3 as an example: the addition was allowed even though node3's node certificate was not issued by any certificate in the cluster's trust store. As a result, once node3 has joined, the cluster contains a node whose certificate does not chain to anything in the trust store.

      • Are the certificates actually the cause of the rebalance failure (as I assumed)?
      • Is it intended that these node additions are possible? Shouldn't we prevent node addition in cases where it leaves incomplete chains of trust?
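The chain-of-trust question can be stated as a small model. This is only a sketch under loud assumptions that the ticket does not confirm: each node certificate is issued by the matching intermediate (node3 by int3.pem), every intermediate is issued by ca.pem, and the verifying side is node1 with a trust store of ca.pem plus int1.pem. The file names are used purely as labels; nothing is read from the attachments, and the function `chains_to_trust_store` is hypothetical.

```python
from typing import List, Set

# ASSUMED issuer relationships, taken from the description, not from the files.
ISSUER = {
    "couchbase.node1.svc.pem": "int1.pem",
    "couchbase.node2.svc.pem": "int2.pem",
    "couchbase.node3.svc.pem": "int3.pem",
    "int1.pem": "ca.pem",
    "int2.pem": "ca.pem",
    "int3.pem": "ca.pem",
    "ca.pem": "ca.pem",  # self-signed root
}


def chains_to_trust_store(presented: List[str], trust_store: Set[str]) -> bool:
    """Walk issuer links from the leaf until a trusted certificate is reached.

    presented[0] is the leaf; any further entries are intermediates that the
    peer sends along with it. An issuer can only be followed if it is either
    presented by the peer or already in the local trust store.
    """
    available = set(presented) | trust_store
    current = presented[0]
    seen = set()
    while current not in seen:
        seen.add(current)
        if current in trust_store:
            return True
        issuer = ISSUER.get(current)
        if issuer is None or issuer not in available:
            return False  # the "unknown authority" case in this model
        current = issuer
    return False


if __name__ == "__main__":
    node1_store = {"ca.pem", "int1.pem"}
    for leaf in ("couchbase.node1.svc.pem", "couchbase.node3.svc.pem"):
        print(leaf, chains_to_trust_store([leaf], node1_store))
```

In this model the outcome for node3 depends on whether int3.pem is presented alongside the leaf: with the leaf alone the walk stops at a missing issuer, while a presented `[leaf, int3.pem]` chain reaches ca.pem. Checking what each node actually presents during the handshake would therefore help answer the first question.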

       

      https://supportal.couchbase.com/snapshot/fcca479d774f82dec488e47fb6550d1a::0

      s3://cb-customers-secure/deacon/2024-07-09/node1_cbcollect-6cfa72cae055b8ad.zip
      s3://cb-customers-secure/deacon/2024-07-09/node3_cbcollect-84e75a71620a0f69.zip
      s3://cb-customers-secure/deacon/2024-07-09/node2_cbcollect-6e3cae9ede39b17c.zip


      Attachments

        1. ca.pem
          1 kB
        2. couchbase.node1.svc.pem
          1 kB
        3. couchbase.node2.svc.pem
          1 kB
        4. couchbase.node3.svc.pem
          1 kB
        5. int1.pem
          1 kB
        6. int2.pem
          1 kB
        7. int3.pem
          1 kB

        Activity

          People

            Abhijeeth.Nuthan Abhijeeth Nuthan
            deacon.linkhorn Deacon Linkhorn