Description
This issue affects Eventing rebalance because Eventing uses GoCB on the server side. We create multiple GoCB handles for every deployed application. During rebalance, GoCB uses so many file descriptors that Eventing nodes are left with no ephemeral ports for the inter-node communication that rebalance requires. As a result, Eventing rebalance fails, as captured in MB-30323.
The problem can be reproduced outside of Eventing as well. Sample program: https://gist.github.com/abhi-bit/c1ed794979650f851ee4829ea7f0ed37
The program creates a single GoCB handle and shares it across multiple goroutines. An lsof dump is attached. In that dump, 256 sockets are active with KV nodes in the cluster.
By comparison, libcouchbase (lcb) creates only 4 sockets on the same setup:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
a.out 188218 root cwd DIR 253,0 4096 537358849 /root
a.out 188218 root rtd DIR 253,0 280 64 /
a.out 188218 root txt REG 253,0 14200 538254848 /root/a.out
a.out 188218 root mem REG 253,0 62184 8098 /usr/lib64/libnss_files-2.17.so
a.out 188218 root mem REG 253,0 402384 8260 /usr/lib64/libpcre.so.1.2.0
a.out 188218 root mem REG 253,0 155744 8276 /usr/lib64/libselinux.so.1
a.out 188218 root mem REG 253,0 15688 8624 /usr/lib64/libkeyutils.so.1.5
a.out 188218 root mem REG 253,0 58728 17811 /usr/lib64/libkrb5support.so.0.1
a.out 188218 root mem REG 253,0 90664 8279 /usr/lib64/libz.so.1.2.7
a.out 188218 root mem REG 253,0 210768 17803 /usr/lib64/libk5crypto.so.3.1
a.out 188218 root mem REG 253,0 15848 8299 /usr/lib64/libcom_err.so.2.1
a.out 188218 root mem REG 253,0 963504 17809 /usr/lib64/libkrb5.so.3.3
a.out 188218 root mem REG 253,0 320768 17799 /usr/lib64/libgssapi_krb5.so.2.2
a.out 188218 root mem REG 253,0 144792 8106 /usr/lib64/libpthread-2.17.so
a.out 188218 root mem REG 253,0 44448 8110 /usr/lib64/librt-2.17.so
a.out 188218 root mem REG 253,0 2512448 17662 /usr/lib64/libcrypto.so.1.0.2k
a.out 188218 root mem REG 253,0 470336 17664 /usr/lib64/libssl.so.1.0.2k
a.out 188218 root mem REG 253,0 111080 8108 /usr/lib64/libresolv-2.17.so
a.out 188218 root mem REG 253,0 19776 8086 /usr/lib64/libdl-2.17.so
a.out 188218 root mem REG 253,0 2127336 8080 /usr/lib64/libc-2.17.so
a.out 188218 root mem REG 253,0 751688 806448640 /usr/local/lib64/libgcc_s.so.1
a.out 188218 root mem REG 253,0 1139680 8088 /usr/lib64/libm-2.17.so
a.out 188218 root mem REG 253,0 1561680 806448628 /usr/local/lib64/libstdc++.so.6.0.24
a.out 188218 root mem REG 253,0 955872 1792968 /usr/lib64/libcouchbase.so.2.0.58
a.out 188218 root mem REG 253,0 164264 8073 /usr/lib64/ld-2.17.so
a.out 188218 root 0u CHR 136,10 0t0 13 /dev/pts/10
a.out 188218 root 1u CHR 136,10 0t0 13 /dev/pts/10
a.out 188218 root 2u CHR 136,10 0t0 13 /dev/pts/10
a.out 188218 root 3u IPv4 11883098 0t0 TCP localhost:57688->localhost:jamlink (ESTABLISHED)
a.out 188218 root 4u IPv4 11888032 0t0 TCP localhost.localdomain:59836->172.23.96.20:11210 (ESTABLISHED)
a.out 188218 root 5u IPv4 11888033 0t0 TCP localhost.localdomain:46646->172.23.96.16:11210 (ESTABLISHED)
a.out 188218 root 6u IPv4 11888034 0t0 TCP localhost.localdomain:60958->172.23.96.17:11210 (ESTABLISHED)
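For anyone who wants to compare socket counts between the two clients, here is a minimal sketch that tallies ESTABLISHED TCP sockets per remote KV endpoint from lsof output. The function name `countKVSockets` and the choice to filter on port 11210 are assumptions made for illustration. They are not part of GoCB or the linked gist. The sample rows are copied from the dump above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countKVSockets is a hypothetical helper: it tallies ESTABLISHED TCP
// sockets per remote endpoint in lsof output, keeping only remote
// endpoints on the given port (for example "11210" for KV).
func countKVSockets(lsofOutput, port string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(lsofOutput))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// A TCP row has 10 whitespace-separated fields:
		// COMMAND PID USER FD TYPE DEVICE SIZE/OFF TCP local->remote (STATE)
		if len(fields) < 10 || fields[7] != "TCP" || fields[9] != "(ESTABLISHED)" {
			continue
		}
		parts := strings.SplitN(fields[8], "->", 2)
		if len(parts) != 2 {
			continue
		}
		remote := parts[1]
		if strings.HasSuffix(remote, ":"+port) {
			counts[remote]++
		}
	}
	return counts
}

func main() {
	// Socket rows copied from the lsof dump in this ticket.
	const sample = `a.out 188218 root 3u IPv4 11883098 0t0 TCP localhost:57688->localhost:jamlink (ESTABLISHED)
a.out 188218 root 4u IPv4 11888032 0t0 TCP localhost.localdomain:59836->172.23.96.20:11210 (ESTABLISHED)
a.out 188218 root 5u IPv4 11888033 0t0 TCP localhost.localdomain:46646->172.23.96.16:11210 (ESTABLISHED)
a.out 188218 root 6u IPv4 11888034 0t0 TCP localhost.localdomain:60958->172.23.96.17:11210 (ESTABLISHED)`

	total := 0
	for remote, n := range countKVSockets(sample, "11210") {
		fmt.Printf("%s: %d\n", remote, n)
		total += n
	}
	fmt.Println("total KV sockets:", total)
}
```

Running the same tally on the lsof dump of the GoCB sample program should make the per-node socket count directly comparable with the lcb figures.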
The objective of this ticket is to reduce the number of sockets that GoCB uses.
Issue Links
- blocks MB-30323 rebalance exited with reason failed to aggregate rebalance progress from all eventing nodes (Closed)