  1. Couchbase Server
  2. MB-47457

FTS is broken with Alternate Addresses set


Details

    • Untriaged
    • 1
    • Yes

    Description

      Summary

      FTS incorrectly uses the alternate addresses set for a node for DCP streams.

      In the worst (and common) case, the alternate address is not reachable from within the cluster (e.g. the cluster runs in Kubernetes and the alternate address is a Load Balancer for an individual pod, a fairly common deployment scenario). FTS is then completely broken: indexes can be created, but they are never built and cannot be searched.

      In the best case, where the cluster is not deployed in Kubernetes but uses Alternate Addresses for things like XDCR, DCP traffic is incorrectly routed over the public internet rather than over the private network.

      This is a regression: FTS works fine in this environment on 6.6.x.

      Steps to Reproduce

      1. Spin up the latest RC (I used Docker here):

        docker run -d --name cc-rc -p 8091-8097:8091-8097 registry.gitlab.com/cb-vanilla/server:7.0.0-5302
        

      2. Set up the cluster with FTS enabled. Make sure you set the node's hostname to its IP address and NOT 127.0.0.1 at this step. I also enabled node-to-node encryption; I am not sure whether this is required for reproduction.
      3. Set an alternate address for your node. This can be any hostname that resolves but is not reachable; I used a Couchbase Cloud public hostname:

        /opt/couchbase/bin/couchbase-cli setting-alternate-address --cluster localhost --username Administrator --password password --set --hostname cb-0001.76aad0f6-de8a-46d8-9794-47df1b10f91f.dataplane.nonprod-project-avengers.com --node 172.17.0.2
        

      4. Restart the cbft process
      5. Create an Index in the UI
      6. Try to search the Index
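
The steps above can also be sanity-checked from the command line. The following is a minimal sketch, not part of the original report: the helper names `has_alternate_address` and `index_doc_count` and the index name `test-index` are hypothetical, and it assumes the ports published in step 1 and the credentials used in step 3. The first helper is a coarse check that the alternate address from step 3 is registered; the second reads an index's document count, which would be expected to stay at 0 for a non-empty bucket if the index is never built.

```shell
#!/bin/sh
# Hypothetical helpers: the function names are illustrative only.

# Succeeds if the JSON on stdin (e.g. from /pools/default/nodeServices)
# contains an alternateAddresses section and mentions the given hostname.
# This is a coarse text match, not a full JSON parse.
has_alternate_address() {
  body=$(cat)
  printf '%s\n' "$body" | grep -q '"alternateAddresses"' &&
    printf '%s\n' "$body" | grep -qF "\"$1\""
}

# Prints the document count found in the JSON on stdin
# (e.g. from the FTS endpoint /api/index/<name>/count).
index_doc_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example usage against the container from step 1 (not executed here;
# "test-index" is an assumed index name):
#   curl -s -u Administrator:password \
#     http://localhost:8091/pools/default/nodeServices \
#     | has_alternate_address cb-0001.76aad0f6-de8a-46d8-9794-47df1b10f91f.dataplane.nonprod-project-avengers.com
#   curl -s -u Administrator:password \
#     http://localhost:8094/api/index/test-index/count | index_doc_count
```

`couchbase-cli setting-alternate-address --list` with the same `--cluster`, `--username` and `--password` options should show the same alternate address information without going through the REST API.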

      Expected Result

      The index search completes successfully

      Actual Result

      Investigation

      Logs show that FTS is trying to use the external address:

      2021-07-16T11:10:57.457+00:00 [INFO] (GOCBCORE) Creating new agent: &{MemdAddrs:[] HTTPAddrs:[127.0.0.1:8091] BucketName:test UserAgent:matt_7adf0964fce22708_4c1c5584 UseTLS:false NetworkType: Auth:0x1d368d0 TLSRootCAProvider:<nil> UseMutationTokens:false UseCompression:false UseDurations:false DisableDecompression:false UseOutOfOrderResponses:false DisableXErrors:false DisableJSONHello:false DisableSyncReplicationHello:false UseCollections:true CompressionMinSize:0 CompressionMinRatio:0 HTTPRedialPeriod:0s HTTPRetryDelay:0s HTTPMaxWait:0s CccpMaxWait:0s CccpPollPeriod:0s ConnectTimeout:1m0s KVConnectTimeout:7s KvPoolSize:0 MaxQueueSize:0 HTTPMaxIdleConns:0 HTTPMaxIdleConnsPerHost:0 HTTPIdleConnectionTimeout:0s Tracer:<nil> NoRootTraceSpans:false DefaultRetryStrategy:<nil> CircuitBreakerConfig:{Enabled:false VolumeThreshold:0 ErrorThresholdPercentage:0 SleepWindow:0s RollingWindow:0s CompletionCallback:<nil> CanaryTimeout:0s} UseZombieLogger:false ZombieLoggerInterval:0s ZombieLoggerSampleSize:0 AuthMechanisms:[]}
       
      2021-07-16T11:11:02.517+00:00 [WARN] (GOCBCORE) Pipeline Client 0xc00069ee40 failed to bootstrap: dial tcp 18.210.140.11:11210: i/o timeout -- cbgt.GocbcoreLogger.Log() at gocbcore_utils.go:617
      

      Note that 18.210.140.11 is what cb-0001.76aad0f6-de8a-46d8-9794-47df1b10f91f.dataplane.nonprod-project-avengers.com resolves to:

      nslookup cb-0001.76aad0f6-de8a-46d8-9794-47df1b10f91f.dataplane.nonprod-project-avengers.com
      Server:         8.8.8.8
      Address:        8.8.8.8#53
       
      Non-authoritative answer:
      Name:   cb-0001.76aad0f6-de8a-46d8-9794-47df1b10f91f.dataplane.nonprod-project-avengers.com
      Address: 18.210.140.11
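
The comparison between the log and the DNS lookup can be repeated with a small helper. This is a sketch only: the function name `dial_target` is hypothetical, and the log path in the usage comment assumes a default Linux install location. It pulls the `host:port` that the GOCBCORE client failed to dial out of warning lines like the one above, so it can be compared with the `nslookup` result.

```shell
#!/bin/sh
# Hypothetical helper: prints the "ip:port" that follows "dial tcp" in each
# log line on stdin; lines without a dial target produce no output.
dial_target() {
  sed -n 's/.*dial tcp \([0-9.][0-9.]*:[0-9][0-9]*\).*/\1/p'
}

# Example usage inside the container (not executed here; the log path is an
# assumption based on a default install):
#   grep 'failed to bootstrap' /opt/couchbase/var/lib/couchbase/logs/fts.log \
#     | dial_target | sort -u
```

If the printed address matches the alternate hostname's IP rather than the node's own address (172.17.0.2 in this reproduction), the client is dialing the external address.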
      

      Also, strangely, it is trying to use port 11210 even though node-to-node encryption is enabled; I would have expected it to use the encrypted ports.

      Logs

      https://cb-engineering.s3.amazonaws.com/MB-47457/collectinfo-2021-07-19T133501-ns_1%40172.17.0.2.zip

            People

              girish.benakappa Girish Benakappa
              matt.carabine Matt Carabine (Inactive)