FTS is broken with Alternate Addresses set

Description

Summary

FTS incorrectly uses a node's alternate addresses for DCP streams.

In the worst (and common) case, the alternate address is not reachable from within the cluster (e.g. in Kubernetes, where it is a Load Balancer for an individual pod, a fairly common deployment scenario). FTS is then completely broken: indexes can be created, but they are never built and cannot be searched.

In the best case, where the cluster is not deployed in Kubernetes but uses Alternate Addresses for things like XDCR, the DCP traffic is incorrectly routed out over the public internet rather than staying on the private network.

This is a regression: FTS works fine in this environment on 6.6.x.
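
To make the distinction concrete, here is a minimal sketch of choosing between a node's internal hostname and its alternate address based on a network type. It is illustrative only and is not the cbgt or gocbcore code: the function name `select_kv_host`, the dictionary layout (loosely modelled on a `hostname` plus `alternateAddresses.external` entry), and the example hostnames are all assumptions made for this example.

```python
def select_kv_host(node: dict, network_type: str = "default") -> str:
    """Return the hostname a client would dial for this node.

    "default" means the node's internal (cluster-private) hostname.
    Any other value is looked up among the node's alternate addresses.
    """
    if network_type == "default":
        return node["hostname"]
    alt = node.get("alternateAddresses", {}).get(network_type)
    if alt is None:
        raise KeyError(f"node has no alternate address for network {network_type!r}")
    return alt["hostname"]


# Hypothetical node entry: internal IP plus an external alternate address.
node = {
    "hostname": "172.17.0.2",
    "alternateAddresses": {
        "external": {"hostname": "cb-0001.example.invalid"},
    },
}

# Intra-cluster traffic such as DCP should resolve to the internal address.
internal_host = select_kv_host(node, "default")
# A client outside the cluster would resolve to the alternate address.
external_host = select_kv_host(node, "external")
```

In these terms, the behaviour described above is what you would see if an intra-cluster consumer resolved addresses with the external network type instead of the default one.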

Steps to Reproduce

  1. Spin up the latest RC (I used Docker here):

  2. Set up the cluster with FTS enabled. Ensure that you set the hostname of the node to its IP and NOT 127.0.0.1 at this step. I also enabled node-to-node encryption; I'm not sure whether this is required for reproduction.

  3. Set an alternate address for your node. You can set this to any address that will resolve but will not be accessible; I just used a Couchbase Cloud public hostname:

  4. Restart the cbft process

  5. Create an Index in the UI

  6. Try to search the Index
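
The exact commands used in the steps above are not included in this report. As a hedged sketch of step 3 only, the snippet below builds a request against the alternate-address REST endpoint (`PUT /node/controller/setupAlternateAddresses/external` with a `hostname` form field). The base URL, credentials, and hostname are placeholders, the helper name `build_alt_address_request` is hypothetical, and this is not necessarily how the reporter set the address.

```python
import base64
import urllib.parse
import urllib.request


def build_alt_address_request(base_url, username, password, alt_hostname):
    """Build (but do not send) a request that sets a node's external alternate address."""
    body = urllib.parse.urlencode({"hostname": alt_hostname}).encode("ascii")
    req = urllib.request.Request(
        base_url.rstrip("/") + "/node/controller/setupAlternateAddresses/external",
        data=body,
        method="PUT",
    )
    # HTTP Basic authentication header built from the placeholder credentials.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", "Basic " + token)
    return req


# Placeholder values: substitute your node's address and admin credentials.
req = build_alt_address_request(
    "http://172.17.0.2:8091",
    "Administrator",
    "password",
    "unreachable.example.invalid",
)

# To apply it against a running node, send the request:
# urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a live node.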

Expected Result

The index search completes successfully

Actual Result

Investigation

Logs show that FTS is trying to use the external address:

Note that 18.210.140.11 is what cb-0001.76aad0f6-de8a-46d8-9794-47df1b10f91f.dataplane.nonprod-project-avengers.com resolves to:

Also, strangely, it's trying to use port 11210 even though node-to-node encryption is enabled; I would have expected it to use the encrypted ports.
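
For reference, 11210 is the default plain-text data service port and 11207 is its default TLS counterpart. The sketch below shows the selection the reporter expected. The function name `select_kv_port` and the `services` dictionary are assumptions for illustration, and the report does not establish whether node-to-node encryption is meant to govern this particular connection.

```python
def select_kv_port(services: dict, encrypted: bool) -> int:
    """Pick the data service port: the TLS port when encryption is on, else plain."""
    key = "kvSSL" if encrypted else "kv"
    return services[key]


# Default Couchbase data service ports (plain and TLS).
services = {"kv": 11210, "kvSSL": 11207}

expected_with_encryption = select_kv_port(services, encrypted=True)
observed_in_logs = select_kv_port(services, encrypted=False)
```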

Logs

https://cb-engineering.s3.amazonaws.com/MB-47457/collectinfo-2021-07-19T133501-ns_1%40172.17.0.2.zip

Activity

Girish Benakappa January 19, 2022 at 7:48 AM

Verified with 7.1.0-2021. Closing the issue.

Girish Benakappa August 19, 2021 at 7:58 PM

Verified in 7.0.1 build 6102

Create/Update/Drop FTS indexes
Running queries from UI and REST
Flex queries from N1QL
Drop Buckets should cover up basic scenarios

CB robot August 12, 2021 at 8:26 PM

Build couchbase-server-7.0.1-6100 contains cbgt commit 5a75a31 with commit message:
[BP] NetworkType:default for gocbcore Agent/DCPAgent configs

CB robot August 11, 2021 at 5:47 PM

Build couchbase-server-7.0.0-5306 contains cbgt commit 569ca44 with commit message:
[BP] NetworkType:default for gocbcore Agent/DCPAgent configs

Arunkumar Senthilnathan July 31, 2021 at 8:32 PM

Verified in 7.0.1-5960. The following scenarios were tested on a 3-node CB cluster deployed on GKE with CAO 2.2.0:

Create/Update/Drop FTS indexes
Running queries from UI and REST
Flex queries from N1QL
Drop Buckets should cover up basic scenarios

Same scenarios will be executed on 7.1.0 as well

CC

Fixed
Details

Is this a Regression?

Yes

Triage

Untriaged

Created July 16, 2021 at 11:25 AM
Updated January 19, 2022 at 7:48 AM
Resolved July 26, 2021 at 3:39 PM