Details
- Bug
- Resolution: Duplicate
- Critical
- Cheshire-Cat
- 7.0.0-4454-enterprise
- Untriaged
- Centos 64-bit
- 1
- Unknown
Description
Build: 7.0.0-4454
Scenario:
- Initialize a cluster with two nodes (kv, kv+index+n1ql)
- Create a Couchbase bucket with replica=1
- Rebalance in 2 nodes with doc CRUDs running in parallel (success)
- Rebalance in 2 more nodes with doc CRUDs running (success)
- Final Cluster stats:
+----------------+-----------------+------+------------+------------+----------------------+------------------+
| Node | Services | CPU | Mem_total | Mem_free | Swap_mem_used | Active / Replica |
+----------------+-----------------+------+------------+------------+----------------------+------------------+
| 172.23.105.126 | kv | 6.43 | 4201627648 | 3425701888 | 1048576 / 3758092288 | 4989 / 5108 |
| 172.23.105.128 | kv | 6.16 | 4201627648 | 3424706560 | 0 / 3758092288 | 5127 / 4934 |
| 172.23.104.172 | index, kv, n1ql | 11.7 | 3947372544 | 3012943872 | 221184 / 3758092288 | 4982 / 5103 |
| 172.23.105.127 | kv | 4.88 | 4201627648 | 3397816320 | 0 / 3758092288 | 5075 / 5043 |
| 172.23.105.158 | kv | 5.79 | 4201631744 | 3393880064 | 0 / 3758092288 | 4936 / 4884 |
| 172.23.104.158 | kv | 14.8 | 4201676800 | 3443019776 | 1310720 / 3758092288 | 4891 / 4928 |
+----------------+-----------------+------+------------+------------+----------------------+------------------+
- Rebalance out all nodes
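As a quick sanity check on the cluster stats table before the final rebalance out, the Active / Replica column can be totalled per side. This is only a sketch: it assumes the column holds per-node item counts, and the values are copied by hand from the table; the dictionary and function names are made up for illustration.

```python
# Active / Replica values copied from the "Final Cluster stats" table.
# Assumption: these are item counts per node (the ticket does not say so explicitly).
ACTIVE_REPLICA = {
    "172.23.105.126": (4989, 5108),
    "172.23.105.128": (5127, 4934),
    "172.23.104.172": (4982, 5103),
    "172.23.105.127": (5075, 5043),
    "172.23.105.158": (4936, 4884),
    "172.23.104.158": (4891, 4928),
}


def totals(stats):
    """Return (total_active, total_replica) across all nodes."""
    active = sum(a for a, _ in stats.values())
    replica = sum(r for _, r in stats.values())
    return active, replica


active_total, replica_total = totals(ACTIVE_REPLICA)
print(active_total, replica_total)  # 30000 30000
```

With replica=1 the two totals are expected to match, and they do here (30000 each), which suggests the data was fully replicated before the failing rebalance started.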
Observation:
During the final rebalance out of all nodes, the rebalance fails because memcached is killed with exit code 137. The following logs are seen:
Service 'memcached' exited with status 137. Restarting. Messages:
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0216 00:59:13.516377 22657 HazptrDomain.h:671] Using the default inline executor for asynchronous reclamation may be susceptible to deadlock if the current thread happens to hold a resource needed by the deleter of a reclaimable object
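For triage, the exit status can be decoded into a signal name. The sketch below assumes the status follows the common "128 + signal number" convention for processes terminated by a signal; the helper name is hypothetical.

```python
import signal


def describe_exit_status(status):
    """Return the signal name for a 128 + N exit status, or None otherwise."""
    if status <= 128:
        return None  # ordinary exit code, not a signal-derived status
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


print(describe_exit_status(137))
```

On Linux, 137 - 128 = 9 maps to SIGKILL. The logs above do not show what sent the signal, so the sender (for example the kernel OOM killer or another process) would still need to be confirmed from system logs.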
Rebalance failure UI logs:
Node 'ns_1@172.23.104.172' saw that node 'ns_1@172.23.105.158' went down. Details: [{nodedown_reason, connection_closed}]
Node 'ns_1@172.23.104.158' saw that node 'ns_1@172.23.105.158' went down. Details: [{nodedown_reason, connection_closed}]
Rebalance exited with reason shun_failed.
Rebalance Operation Id = 9ebe83ad7194372a38613770f88d57a1
Node 'ns_1@172.23.105.158' is leaving cluster.
Node 'ns_1@172.23.104.172' saw that node 'ns_1@172.23.105.127' went down. Details: [{nodedown_reason, connection_closed}]
Node 'ns_1@172.23.104.158' saw that node 'ns_1@172.23.105.127' went down. Details: [{nodedown_reason, connection_closed}]
Node 'ns_1@172.23.105.158' saw that node 'ns_1@172.23.105.127' went down. Details: [{nodedown_reason, connection_closed}]
Node 'ns_1@172.23.105.127' is leaving cluster.
Node 'ns_1@172.23.104.172' saw that node 'ns_1@172.23.105.128' went down. Details: [{nodedown_reason, connection_closed}]
Node 'ns_1@172.23.104.158' saw that node 'ns_1@172.23.105.128' went down. Details: [{nodedown_reason, connection_closed}]
Issue Links
- duplicates MB-44272 Rebalance exited with reason shun_failed (Closed)