Details
-
Bug
-
Resolution: User Error
-
Major
-
Cheshire-Cat
-
Untriaged
-
-
1
-
Yes
Description
Build: 7.0.0-2278
Steps to reproduce:
1. Initialize a 1-node cluster with the kv, n1ql, and index services, with IPv6 enabled.
2. Add another node to this cluster with the kv, n1ql, and index services.
3. Start a rebalance.
The rebalance fails with the following error:
[ns_server:error,2020-06-09T10:07:10.054-07:00,ns_1@s10501-ip6.qe.couchbase.com:service_rebalancer-index-worker<0.3064.4>:service_agent:process_bad_results:862]Service call get_node_infos (service index) failed on some nodes:
[{'ns_1@s10501-ip6.qe.couchbase.com',
  {exit,
   {{linked_process_died,<0.3002.4>,{no_connection,"index-service_api"}},
    {gen_server,call,
     [{'service_agent-index','ns_1@s10501-ip6.qe.couchbase.com'},
      {if_rebalance,<0.3054.4>,get_node_info},
      infinity]}}}}]
[ns_server:error,2020-06-09T10:07:10.054-07:00,ns_1@s10501-ip6.qe.couchbase.com:cleanup_process<0.3053.4>:service_janitor:maybe_init_topology_aware_service:87]Initial rebalance for `index` failed: {error,
 {initial_rebalance_failed,index,
  {agent_died,<0.3001.4>,
   {linked_process_died,<0.3002.4>,
    {no_connection,
     "index-service_api"}}}}}
The indexer log is full of errors like the following:
2020-06-09T10:10:44.592-07:00 [Error] KVSender::closeMutationStream, MAINT_STREAM Error from Projector Post http://s10501-ip6.qe.couchbase.com:9999/adminport/shutdownTopicRequest: dial tcp 172.23.211.58:9999: connect: connection refused
2020-06-09T10:10:44.592-07:00 [Fatal] Indexer::closeAllStreams Stream MAINT_STREAM Projector health check needed, indexer can not proceed, Error received Post http://s10501-ip6.qe.couchbase.com:9999/adminport/shutdownTopicRequest: dial tcp 172.23.211.58:9999: connect: connection refused. Retrying (526).
2020-06-09T10:10:49.592-07:00 [Info] KVSender::sendShutdownTopic Projector s10501-ip6.qe.couchbase.com:9999 Topic MAINT_STREAM_TOPIC_fa659551e9c83565b61c9b506e3c98fe
2020-06-09T10:10:49.593-07:00 [Error] KVSender::sendShutdownTopic Unexpected Error During Shutdown Projector s10501-ip6.qe.couchbase.com:9999 Topic MAINT_STREAM_TOPIC_fa659551e9c83565b61c9b506e3c98fe. Err Post http://s10501-ip6.qe.couchbase.com:9999/adminport/shutdownTopicRequest: dial tcp 172.23.211.58:9999: connect: connection refused
2020-06-09T10:10:49.593-07:00 [Error] KVSender::closeMutationStream MAINT_STREAM Error Received Post http://s10501-ip6.qe.couchbase.com:9999/adminport/shutdownTopicRequest: dial tcp 172.23.211.58:9999: connect: connection refused from s10501-ip6.qe.couchbase.com:9999
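Note that every failing request in the indexer log dials 172.23.211.58, the IPv4 address of the node, even though the cluster was initialized with IPv6 enabled. A minimal Python sketch for spotting this in a log is shown here. The helper name `dialed_family` and the regular expression are illustrative assumptions, not part of any Couchbase tooling; the only thing relied on is the `dial tcp <ip>:<port>` fragment visible in the errors above, with IPv6 literals expected in square brackets.

```python
import ipaddress
import re

# Matches the "dial tcp <ip>:<port>" fragment seen in the errors above.
# IPv6 literals are expected in brackets, e.g. "dial tcp [fd63::1]:9999".
DIAL_RE = re.compile(
    r"dial tcp (?:\[(?P<v6>[^\]]+)\]|(?P<v4>[^:\s]+)):(?P<port>\d+)"
)


def dialed_family(log_line):
    """Return (family, address, port) for the first dial target, or None."""
    match = DIAL_RE.search(log_line)
    if match is None:
        return None
    try:
        addr = ipaddress.ip_address(match.group("v6") or match.group("v4"))
    except ValueError:
        # Not an IP literal (hypothetical case: an unresolved host name).
        return None
    family = "IPv6" if addr.version == 6 else "IPv4"
    return (family, str(addr), int(match.group("port")))


# One of the error lines from this report, shortened to the relevant part.
line = (
    "[Error] KVSender::closeMutationStream, MAINT_STREAM Error from Projector "
    "Post http://s10501-ip6.qe.couchbase.com:9999/adminport/shutdownTopicRequest: "
    "dial tcp 172.23.211.58:9999: connect: connection refused"
)
print(dialed_family(line))
```

Running this over the whole indexer log and counting the families would show whether any request was attempted over IPv6 at all.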
This issue is seen on VMs that have both IPv4 and IPv6 addresses. The machines above produce the following ifconfig output:
[root@s10501-ip6 logs]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.23.211.58 netmask 255.255.255.0 broadcast 172.23.211.255
        inet6 fd63:6f75:6368:20d3:ac3c:257e:9c5:6619 prefixlen 64 scopeid 0x0<global>
        inet6 fe80::dc11:da68:e01f:5368 prefixlen 64 scopeid 0x20<link>
        ether 02:57:a0:50:0a:cb txqueuelen 1000 (Ethernet)
        RX packets 208184507 bytes 91874002860 (85.5 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 284737800 bytes 230012848725 (214.2 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1 (Local Loopback)
        RX packets 222097621 bytes 499733590235 (465.4 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 222097621 bytes 499733590235 (465.4 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

[root@s10502-ip6 ~]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.23.211.43 netmask 255.255.255.0 broadcast 172.23.211.255
        inet6 fd63:6f75:6368:20d3:d97a:7875:2e48:ea25 prefixlen 64 scopeid 0x0<global>
        inet6 fe80::8b3f:d5df:572d:bbc4 prefixlen 64 scopeid 0x20<link>
        ether fe:c0:e3:98:2b:d3 txqueuelen 1000 (Ethernet)
        RX packets 10688808 bytes 10507260695 (9.7 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 5667521 bytes 3245147569 (3.0 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1 (Local Loopback)
        RX packets 5453035 bytes 8576274848 (7.9 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 5453035 bytes 8576274848 (7.9 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
On machines that have only an IPv6 interface, however, this issue is not seen.
[root@s10510-ip6 tmp]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fd63:6f75:6368:20d4:d7f8:96be:407a:be32 prefixlen 64 scopeid 0x0<global>
        inet6 2600:2109:1:d4:8654:af10:e32a:1e4f prefixlen 64 scopeid 0x0<global>
        inet6 fd63:6f75:6368:20d4:1234::1 prefixlen 64 scopeid 0x0<global>
        inet6 fe80::392:e5e9:4473:3f46 prefixlen 64 scopeid 0x20<link>
        ether 3e:12:72:c2:7a:79 txqueuelen 1000 (Ethernet)
        RX packets 10172173 bytes 10242758492 (9.5 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 6246975 bytes 4362772469 (4.0 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1 (Local Loopback)
        RX packets 8721361 bytes 7492122535 (6.9 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 8721361 bytes 7492122535 (6.9 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
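The difference between the failing and the working machines is whether eth0 carries an IPv4 address in addition to its IPv6 ones (the loopback interface has both families on every machine). A rough Python sketch for classifying a test VM from its ifconfig output follows. The function names and the classification labels are assumptions made for illustration, and the sample inputs are trimmed copies of the outputs in this report.

```python
import re


def address_families(ifconfig_output):
    """Map interface name -> set of address keywords ("inet", "inet6")."""
    families = {}
    current = None
    for raw in ifconfig_output.splitlines():
        line = raw.strip()
        header = re.match(r"^([\w.-]+): flags=", line)
        if header:
            # A new interface block starts, e.g. "eth0: flags=4163<...>".
            current = header.group(1)
            families[current] = set()
        elif current and line.startswith(("inet ", "inet6 ")):
            families[current].add(line.split()[0])
    return families


def stack_type(ifconfig_output):
    """Classify a host by the families on its non-loopback interfaces."""
    seen = set()
    for name, fams in address_families(ifconfig_output).items():
        if name != "lo":
            seen |= fams
    if seen == {"inet", "inet6"}:
        return "dual-stack"
    if seen == {"inet6"}:
        return "ipv6-only"
    if seen == {"inet"}:
        return "ipv4-only"
    return "unknown"


# Trimmed from s10501-ip6 (issue reproduces).
DUAL_STACK = """\
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.23.211.58 netmask 255.255.255.0 broadcast 172.23.211.255
inet6 fd63:6f75:6368:20d3:ac3c:257e:9c5:6619 prefixlen 64 scopeid 0x0<global>
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
"""

# Trimmed from s10510-ip6 (issue does not reproduce).
V6_ONLY = """\
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fd63:6f75:6368:20d4:d7f8:96be:407a:be32 prefixlen 64 scopeid 0x0<global>
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
"""

print(stack_type(DUAL_STACK), stack_type(V6_ONLY))
```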
The following commit could have led to this issue:
Commit: 8b5354fd0de9be3bf162c6e24cf4f5794bb20f05 in build: couchbase-server-7.0.0-2265
MB-31109: Make ip:port binding on projector and indexer more lenient

If the cluster is configured to use ipv4, allow GSI processes
to come up successfully even if they cannot bind to ipv6:port.

Similarly, if the cluster is configured to use ipv6, allow GSI
processes to come up successfully even if they cannot bind to
ipv4:port.

Note that the GSI clients will use the node names from cluster
info cache. And the cluster configuration change with respect
to ipv4/ipv6 protocol stack is not supported if the cluster
node names are based on the ip addresses.

Change-Id: Iadb546e60ef32a3edd8ce3e5e41be8d1e721b443
Author: Amit Kulkarni <amit.kulkarni@couchbase.com>
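For readers unfamiliar with the behavior the commit message describes, the idea of lenient binding can be sketched roughly as below. This is not the GSI implementation (which is written in Go). The function `bind_lenient`, its parameters, and the use of loopback addresses are all assumptions chosen so that the example is self-contained. It only illustrates the stated rule: a bind failure on the family the cluster is not configured for is tolerated, while a failure on the configured family is fatal.

```python
import socket


def bind_lenient(port, required_family):
    """Try to listen on both address families; only one of them must succeed.

    required_family is "ipv4" or "ipv6", i.e. the family the cluster is
    configured to use. Returns a dict of family name -> listening socket.
    """
    # Loopback addresses are an assumption that keeps the sketch local.
    targets = {
        "ipv4": (socket.AF_INET, "127.0.0.1"),
        "ipv6": (socket.AF_INET6, "::1"),
    }
    listeners = {}
    for name, (family, host) in targets.items():
        sock = None
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
            if family == socket.AF_INET6:
                # Keep the IPv6 socket from also accepting IPv4 traffic.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.bind((host, port))
            sock.listen()
            listeners[name] = sock
        except OSError as exc:
            if sock is not None:
                sock.close()
            if name == required_family:
                # The configured family must be available: clean up and fail.
                for other in listeners.values():
                    other.close()
                raise
            # The other family is optional: report the failure and carry on.
            print(f"could not bind {name} ({exc}); continuing without it")
    return listeners


if __name__ == "__main__":
    # Port 0 lets the OS pick a free port for each family.
    bound = bind_lenient(0, required_family="ipv4")
    for name, sock in bound.items():
        print(name, sock.getsockname()[:2])
        sock.close()
```

With such a rule, a process can come up listening on only one family, so a client that resolves the node name to an address of the other family would get "connection refused", which is the symptom in the indexer log above. Whether that is what happened here is not established by this report.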