  1. Couchbase Server
  2. MB-48780

FTS does not work if N2n encryption is set to 'all'


Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 7.0.1
    • Neo, 7.0.2
    • fts
    • None
    • Untriaged
    • 1
    • Unknown

    Description

      TL;DR

      When node-to-node (n2n) encryption is set to 'all', FTS treats the certificates used by the cluster as invalid because it tries to connect over loopback. As a result, it can never set up DCP streams and never builds any indexes.
      FTS is essentially non-functional in this setup. I'm unsure whether this only affects clusters where data is colocated with FTS.

      Steps to reproduce

      1. Create a single-node cluster with FTS (I used Docker). Important: do not set the hostname to 127.0.0.1 (I used the IP of the Docker container), and enable node-to-node encryption on the setup screen:

        docker run -d --name 7.0.1 -p 8091-8097:8091-8097 couchbase:7.0.1
        

      2. Set the encryption level to 'all':

        /opt/couchbase/bin/couchbase-cli setting-security -c localhost -u Administrator -p password --set --cluster-encryption-level all
        

      3. Import travel-sample dataset
      4. Create an index on travel-sample
      5. View the index build progress
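
      Steps 3 to 5 can also be scripted against the REST API instead of the UI. The following is a minimal sketch, not the procedure used in this report: it assumes the Administrator/password credentials from step 2, the port mapping from step 1 (8091 for the cluster manager, 8094 for FTS), and the index name matt-test seen in the logs. The function names and the CB_HOST, CB_AUTH and INDEX_NAME variables are hypothetical helpers introduced here.

```shell
#!/bin/sh
# Hypothetical helpers; defaults are assumptions taken from the steps above.
CB_HOST=${CB_HOST:-localhost}
CB_AUTH=${CB_AUTH:-Administrator:password}
INDEX_NAME=${INDEX_NAME:-matt-test}

# Step 3: ask the cluster manager to install the travel-sample bucket.
load_sample() {
  curl -sS -u "$CB_AUTH" -X POST \
    "http://$CB_HOST:8091/sampleBuckets/install" -d '["travel-sample"]'
}

# Minimal index definition: a default full-text index over travel-sample.
index_definition() {
  printf '{"type":"fulltext-index","name":"%s","sourceType":"couchbase","sourceName":"travel-sample"}' "$INDEX_NAME"
}

# Step 4: create the index through the FTS REST API.
create_index() {
  index_definition | curl -sS -u "$CB_AUTH" -X PUT \
    -H "Content-Type: application/json" \
    "http://$CB_HOST:8094/api/index/$INDEX_NAME" -d @-
}

# Step 5: report how many documents the index has ingested so far.
index_doc_count() {
  curl -sS -u "$CB_AUTH" "http://$CB_HOST:8094/api/index/$INDEX_NAME/count"
}
```

      Calling load_sample, then create_index, then index_doc_count a few times gives a rough view of build progress. On an affected cluster the count would be expected to stay at zero, since the DCP feed is never established.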

      Expected behavior

      • Index builds correctly
      • Index is searchable

      Actual behavior

      • Index never builds

      Investigation

      From logs (linked in comments), we can see that it's trying to connect to 127.0.0.1 for DCP:

      2021-10-06T20:58:03.375+00:00 [WARN] (GOCBCORE) Failed to connect to host. Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.17.0.2, not 127.0.0.1 -- cbgt.GocbcoreLogger.Log() at gocbcore_utils.go:618
      2021-10-06T20:58:03.380+00:00 [WARN] feed_dcp_gocbcore: CreateDcpAgent, err: Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.17.0.2, not 127.0.0.1 (close DCPAgent: 0xc0003a8400) -- cbgt.setupGocbcoreDCPAgent() at feed_dcp_gocbcore.go:368
      2021-10-06T20:58:03.380+00:00 [WARN] janitor: JanitorOnce, err: janitor: JanitorOnce errors: 1, []string{"#0: janitor: adding feed, err: feed_dcp_gocbcore: StartGocbcoreDCPFeed, could not prepare DCP feed, name: matt-test_6bddea98114e276e_4c1c5584, server: http://127.0.0.1:8091, bucketName: travel-sample, indexName: matt-test, err: newGocbcoreDCPFeed DCPAgent, err: feed_dcp_gocbcore: fetchAgent, setup err: agent setup failed, err: Get https://127.0.0.1:18091/pools/default/bs/travel-sample: x509: certificate is valid for 172.17.0.2, not 127.0.0.1"} -- cbgt.(*Manager).JanitorLoop() at manager_janitor.go:97
      

      As 127.0.0.1 is not in the SANs of the certificate, this fails.
      It's exceptionally uncommon to add localhost to any certificate, and it's not even possible for certificates issued by third-party public CAs (e.g. GoDaddy).
      I'd expect FTS to always connect for DCP over the hostnames configured in the cluster.
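
      The SAN mismatch itself can be reproduced offline, without a cluster. The sketch below generates a throwaway self-signed certificate whose only SAN is the node IP from the logs, then asks OpenSSL whether the certificate covers each address. It assumes OpenSSL 1.1.1 or newer (for -addext and -checkip); the subject name and file names are made up for illustration. FTS itself validates with Go's TLS stack rather than OpenSSL, so this only illustrates the SAN rule, not the exact code path.

```shell
#!/bin/sh
# Illustration only: a throwaway certificate, not the cluster's real one.
workdir=$(mktemp -d)

# Self-signed certificate whose only SAN is the node IP, as in the report.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=couchbase-node" \
  -addext "subjectAltName=IP:172.17.0.2" \
  -keyout "$workdir/key.pem" -out "$workdir/cert.pem" 2>/dev/null || exit 1

# Check the address the certificate was issued for, then the loopback address.
node_check=$(openssl x509 -in "$workdir/cert.pem" -noout -checkip 172.17.0.2 || true)
loopback_check=$(openssl x509 -in "$workdir/cert.pem" -noout -checkip 127.0.0.1 || true)

echo "$node_check"
echo "$loopback_check"
rm -rf "$workdir"
```

      The first check should report a match and the second should not, which mirrors the x509 error in the FTS log: the certificate is fine for 172.17.0.2, but the client dialled 127.0.0.1.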

      Workaround

      If you are able to, you can add 127.0.0.1 to the SAN of the certificates of each of your nodes.
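
      As a rough sketch of what such a certificate looks like, the script below creates a self-signed test certificate carrying both the node IP and 127.0.0.1 as SANs and confirms that both addresses are covered. It assumes OpenSSL 1.1.1 or newer; the subject and file names are hypothetical. A real node certificate would need to be signed by the cluster's CA and loaded onto each node following the Couchbase certificate documentation, which is not covered here.

```shell
#!/bin/sh
# Illustration only: self-signed, not suitable for loading into a cluster as-is.
workdir=$(mktemp -d)

# Same node IP as in the report, plus the loopback address as a second SAN.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=couchbase-node" \
  -addext "subjectAltName=IP:172.17.0.2,IP:127.0.0.1" \
  -keyout "$workdir/key.pem" -out "$workdir/cert.pem" 2>/dev/null || exit 1

# Both addresses should now be accepted by the SAN check.
san_node=$(openssl x509 -in "$workdir/cert.pem" -noout -checkip 172.17.0.2 || true)
san_loopback=$(openssl x509 -in "$workdir/cert.pem" -noout -checkip 127.0.0.1 || true)

echo "$san_node"
echo "$san_loopback"
rm -rf "$workdir"
```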


          Activity


            Abhinav Dangeti added a comment - Matt Carabine, marking this as resolved for now. Let me know if you have any concerns.


            Matt Carabine added a comment - Cheers, Abhinav Dangeti. I suspected it might have already been fixed but couldn't find it in a search (I used the error message as search terms), so I raised it anyway.


            Girish Benakappa added a comment - Closing this issue as we have already tested the fix in 7.0.2 and Neo using MB-47901.

            People

              abhinav Abhinav Dangeti
              matt.carabine Matt Carabine