There is a bug in the node failover scenario. If a user fails over a node and rebalances without first bringing in a replacement node, the system does not rebuild the lost actives or replicas of search indexes that had partitions residing on the affected node. As a result, search requests return partial (incomplete) results. In addition, existing indexes with defined replicas become vulnerable to additional node failures.
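The effect can be illustrated with a toy model. This is only a sketch under stated assumptions: the partition layouts, the node names (`n1`..`n3`) and both helper functions are hypothetical and do not correspond to any Couchbase API. It assumes that a partition stays searchable as long as at least one copy (active or replica) survives, and that nothing is rebuilt after a failover.

```python
from typing import Dict, Set

# Hypothetical layout: partition id -> nodes holding a copy (active or replica).
Layout = Dict[int, Set[str]]


def fail_over(layout: Layout, node: str) -> Layout:
    """Drop every copy hosted on the failed node; nothing is rebuilt (the bug)."""
    return {pid: nodes - {node} for pid, nodes in layout.items()}


def live_partitions(layout: Layout) -> int:
    """Count partitions that still have at least one surviving copy."""
    return sum(1 for nodes in layout.values() if nodes)


# Index without replicas: each partition lives on exactly one node.
no_replica: Layout = {0: {"n1"}, 1: {"n2"}, 2: {"n3"},
                      3: {"n1"}, 4: {"n2"}, 5: {"n3"}}

# Index with one replica: each partition lives on two nodes.
one_replica: Layout = {0: {"n1", "n2"}, 1: {"n2", "n3"}, 2: {"n3", "n1"},
                       3: {"n1", "n2"}, 4: {"n2", "n3"}, 5: {"n3", "n1"}}

# One failover already loses partitions when there are no replicas.
after_first = fail_over(no_replica, "n3")

# With one replica the first failover is survivable, the second is not.
replica_first = fail_over(one_replica, "n3")
replica_second = fail_over(replica_first, "n2")

print(live_partitions(after_first), "of", len(no_replica))
print(live_partitions(replica_first), "of", len(one_replica))
print(live_partitions(replica_second), "of", len(one_replica))
```

In this model the index without replicas loses partitions on the first failover, while the index with one replica keeps serving all partitions until a second node fails, which matches the two symptoms described above.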
- Create a cluster with 3 server groups with the following config:
(x:y:z means three nodes: the first runs service x, the second service y, and the third service z)
- Create an FTS index on the default bucket
- Run an FTS query
- Fail over the FTS nodes in serverGroup3
- Keep only the server group holding the maximum number of partitions and fail over the remaining FTS nodes
- Find the FTS node in the maximal server group that holds the fewest index partitions
- To test the min-partition FTS node from the maximal server group individually, fail over all the other FTS nodes in the maximal server group
- Run the same query as in step 3
- Assert the same number of results <– FAILURE
Step 9 sees fewer query hits than step 3.
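The node selection in steps 5 and 6 and the final comparison can be sketched as follows. This is an illustrative sketch only: the `partition_map` structure, the group and node names, and all three helper functions are hypothetical, and the hit counts in the usage lines are placeholders rather than measured values. In a real test the partition counts and hit totals would come from the cluster.

```python
from typing import Dict

# Hypothetical structure: server group -> FTS node -> index partitions held.
PartitionMap = Dict[str, Dict[str, int]]


def maximal_group(pmap: PartitionMap) -> str:
    """Return the server group holding the most index partitions.

    On a tie, the first group in iteration order wins.
    """
    return max(pmap, key=lambda group: sum(pmap[group].values()))


def min_partition_node(pmap: PartitionMap, group: str) -> str:
    """Return the node in `group` that holds the fewest index partitions."""
    nodes = pmap[group]
    return min(nodes, key=lambda node: nodes[node])


def hits_match(hits_before: int, hits_after: int) -> bool:
    """Step 9: the query must return the same number of hits after failover."""
    return hits_before == hits_after


# Placeholder layout, not taken from the failing cluster.
partition_map: PartitionMap = {
    "sg1": {"n1": 2, "n2": 1},
    "sg2": {"n3": 2},
    "sg3": {"n4": 1},
}

group = maximal_group(partition_map)
node = min_partition_node(partition_map, group)
print(f"maximal group: {group}, min-partition node: {node}")

# Placeholder hit counts; a lower count after failover reproduces the failure.
print("hits match:", hits_match(hits_before=10, hits_after=7))
```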