Description
Background:
During investigation of CBSE-967 (rebalance failing) we identified that one of the six nodes had an incorrect entry in /etc/hosts: on node-03, the hostname of node-01 was mapped to the loopback address (127.0.0.1). This appears to have been the cause of the failing rebalance, which succeeded after the entry was corrected.
During the discussion of the problem, Alk K commented that 2.5.0 should be able to detect this situation:
> > [DaveR]: If a (remote) cluster node's name resolves to loopback IP / to the same IP as itself, this should flag a warning state.
> [Alk K] That's not a bug. There are a number of environments where we want multiple nodes to be on same IP. Ideally we'd have some way to detect that we're talking to wrong memcached instance. And in fact 2.5.0-s per-node memcached password should actually provide exactly that.
However, it appears the problem was not detected in this case.
Steps to reproduce:
1. Create node1 with a correct /etc/hosts. Install and initialize the cluster on this node, then add a bucket and some data (the example bucket may be sufficient). Example of a correct /etc/hosts:
10.0.0.1 node1 node1.local
10.0.0.2 node2 node2.local
2. Create node2 and give it an /etc/hosts in which node1 is mapped to loopback:
127.0.0.1 node1 node1.local
10.0.0.2 node2 node2.local
3. Add node2 to the cluster (from node1's UI) as "node2.local".
4. Attempt rebalance. It will fail.
Logs attached from my basic reproducer.
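
For triage, the misconfiguration from step 2 can be spotted before attempting a rebalance with a small script. The following is only a sketch and not part of the product: the function name `loopback_hosts` and the `local_names` parameter are hypothetical, and it inspects hosts-file text only (it does not query DNS or the cluster).

```python
import ipaddress


def loopback_hosts(hosts_text, local_names=("localhost",)):
    """Return hostnames in hosts-file text that map to a loopback address.

    Names listed in local_names (an assumed allow-list) are not reported.
    """
    flagged = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            # Skip lines whose first field is not a parseable IP address.
            continue
        if ip.is_loopback:
            flagged.extend(n for n in names if n not in local_names)
    return flagged


# The /etc/hosts contents used for node2 in step 2.
node2_hosts = "127.0.0.1 node1 node1.local\n10.0.0.2 node2 node2.local\n"
print(loopback_hosts(node2_hosts))
```

Run against the node2 file from step 2, this reports `node1` and `node1.local`; against the correct file from step 1 it reports nothing. Since there are environments where several nodes legitimately share an IP, a result like this is best treated as a warning to investigate rather than an error.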