Description
Running a 4-node cluster with cluster_run on OS X.
In the UI
- select failover of node 1
- select graceful failover
- select delta recovery
When the failover has finished
- select rebalance
When the rebalance has finished
- select failover of node 1 again, with the same options as above
The UI now hangs showing "Failing over 1 node" but no progress is made.
ns_rebalancer is blocked in wait_for_mover_tail
erlang:process_info(Pid).
[{current_function,{ns_rebalancer,wait_for_mover_tail,2}},
 {initial_call,{proc_lib,init_p,5}},
 {status,waiting},
 {message_queue_len,0},
 {messages,[]},
 {links,[<0.1368.0>,<0.18016.2>]},
 {dictionary,[{'$ancestors',[<0.1368.0>,ns_orchestrator_sup,
                             mb_master_sup,mb_master,<0.690.0>,ns_server_sup,
                             ns_server_nodes_sup,<0.155.0>,ns_server_cluster_sup,
                             <0.89.0>]},
              {'$initial_call',{erlang,apply,2}}]},
 {trap_exit,false},
 {error_handler,error_handler},
 {priority,normal},
 {group_leader,<0.88.0>},
 {total_heap_size,318186},
 {heap_size,121536},
 {stack_size,24},
 {reductions,256621},
 {garbage_collection,[{min_bin_vheap_size,46422},
                      {min_heap_size,233},
                      {fullsweep_after,512},
                      {minor_gcs,1}]},
 {suspending,[]}]
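For anyone trying to reproduce this, a rough sketch of how the blocked process might be located from an Erlang shell attached to the orchestrator node. It uses only the standard erlang:processes/0 and erlang:process_info/2 BIFs; the variable names Target and Stuck are illustrative, and the sketch assumes the process is still sitting in wait_for_mover_tail/2 when the expressions are evaluated.

```erlang
%% The function the stuck process is expected to be waiting in.
Target = {ns_rebalancer, wait_for_mover_tail, 2}.

%% Collect every local process whose current function matches Target.
%% process_info/2 returns undefined for dead processes, which simply fails the match.
Stuck = [P || P <- erlang:processes(),
              erlang:process_info(P, current_function) =:= {current_function, Target}].

%% For each match, fetch only the fields most relevant to a hang.
[{P, erlang:process_info(P, [status, message_queue_len, links])} || P <- Stuck].
```

An empty message queue with status waiting, as in the dump above, would suggest the process is waiting for a message that is never sent rather than being busy.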
ns_vbucket_mover got an empty Actions list
[ns_server:debug,2016-04-26T18:50:20.076-05:00,n_0@192.168.1.70:<0.18016.2>:ns_vbucket_mover:spawn_workers:326]Got actions: []
But vbucket_move_scheduler:is_done does not return true. The vbucket_move_scheduler state seems to have 490 moves_left.
It is blocked in gen_server:loop.
I have confirmed that this works in Sherlock.