Can be easily reproduced by introducing a crash in perform_actual_join() like so:

$ git diff
diff --git i/src/ns_cluster.erl w/src/ns_cluster.erl
index a72c95aab..1387d3c3f 100644
--- i/src/ns_cluster.erl
+++ w/src/ns_cluster.erl
@@ -1327,6 +1327,8 @@ perform_actual_join(RemoteNode, NewCookie, ChronicleInfo) ->
         ns_cluster_membership:prepare_to_join(RemoteNode, NewCookie),
         ok = chronicle_local:prepare_join(ChronicleInfo),
+        exit(crash),
         %% reload is needed to reinitialize ns_config's cache after
         %% config cleanup ('erase' causes the problem, but it looks like
         %% it's not worth it to add proper 'erase' support to ns_config)

Backports to: MB-49735 [Backport MB-46153] Node that fails to join the cluster might destroy all the data in that cluster

A node crashing during completeJoin goes into an infinite restart loop