Details
- Improvement
- Resolution: Fixed
- Critical
- 3.0, 4.0.0
- Security Level: Public
- Sprint 2 - March 11 - April 3
Description
We have a number of bugs caused by Erlang's global facility, or by the related issue of not being able to spawn a new master quickly. For example:
- MB-7282 (erlang's global naming facility apparently drops globally registered service with actual service still alive (was: impossible to change settings/autoFailover after rebalance))
- MB-7168 [Doc'd 2.2.0] failover of node that's completely down is still not quick (was: Rebalance exited with reason {not_all_nodes_are_ready_yet after failover node)
- MB-8682 start rebalance request is hunging sometimes (looks like another global facility issue)
- MB-5622 Crash of master node may lead to autofailover in 2 minutes instead of configured shorter autofailover period or similarly slow manual failover
By getting off global, we will fix all of these issues.
Issue Links
- blocks
  - MB-12739 Improve Auto-failover for RZA (Resolved)
- is duplicated by
  - MB-7282 erlang's global naming facility apparently drops globally registered service with actual service still alive (was: impossible to change settings/autoFailover after rebalance) (Closed)
  - MB-9691 rebalance repeated failed when add nodes back into cluster (Closed)
  - MB-5622 Crash of master node may lead to autofailover in 2 minutes instead of configured shorter autofailover period or similarly slow manual failover (Closed)
- relates to
  - MB-9415 auto-failover in seconds - (reduced from minimum 30 seconds) (Resolved)
  - MB-9691 rebalance repeated failed when add nodes back into cluster (Closed)
  - MB-14967 Spend some time looking for a workaround for a problem with erlang global facility (was: [system test] Rebalance In fails with error " Request failed/" errors.) (Closed)
  - MB-22807 Failover of node taking ~15 sec when the node down is orchestrator node and timeout is 5 sec (Closed)
  - MB-11614 Discussion - Should we move auto-failover out of erlang? (Open)
  - MB-9066 Increase Autofailover Counter: Enable setting the number of auto-failovers allowed on cluster (Resolved)