Logs and other supporting material are attached to the linked issue.
QE's test starts with a 1-node cluster and an empty bucket with 1 replica. It then adds 2 new nodes and kills one as soon as we report that a rebalance has started. This is inherently racy, and there are three possible outcomes: sometimes the node is reported as failed-add (which the test checks for); much more rarely it goes down and then fails over; and sometimes it refuses to auto fail over and needs manual intervention.
It is this last case that we are specifically interested in, as it requires user intervention. More generally, QE will need to handle non-deterministic behaviour in their test cases.
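
As a rough illustration of what handling the non-determinism could look like on the test side, here is a minimal sketch that classifies the three outcomes described above and flags only the manual-intervention case. It does not use any real test framework or cluster API: the `node_state` dict, its keys, and the function names are all hypothetical and would need to be mapped onto whatever the test actually observes.

```python
from enum import Enum


class Outcome(Enum):
    FAILED_ADD = "failed_add"
    AUTO_FAILED_OVER = "auto_failed_over"
    NEEDS_MANUAL_FAILOVER = "needs_manual_failover"


# Outcomes the test should accept as legitimate results of the race.
ACCEPTABLE = {Outcome.FAILED_ADD, Outcome.AUTO_FAILED_OVER}


def classify(node_state):
    """Map an observed node state onto one of the three outcomes.

    node_state is a hypothetical dict with boolean keys "failed_add"
    and "auto_failed_over"; anything else is treated as the case that
    needs manual intervention.
    """
    if node_state.get("failed_add"):
        return Outcome.FAILED_ADD
    if node_state.get("auto_failed_over"):
        return Outcome.AUTO_FAILED_OVER
    return Outcome.NEEDS_MANUAL_FAILOVER


def check_outcome(node_state):
    """Return (ok, outcome); ok is False only when manual failover is needed."""
    outcome = classify(node_state)
    return outcome in ACCEPTABLE, outcome
```

With this shape, the test passes on either of the first two outcomes and fails, with the outcome attached, only when the node is left needing manual failover.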