Description
The search engine currently scores exact matches and fuzzy matches (those within the configured edit distance) alike. We should score these differently: the closer the hit, the higher the score.
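For illustration, here is a minimal sketch of distance-weighted scoring, assuming a linear penalty; `fuzzy_score` and `levenshtein` are hypothetical names, not the engine's actual code:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def fuzzy_score(query: str, hit: str, base_score: float, max_distance: int) -> float:
    """Scale the base score down as edit distance grows, so an exact
    match (distance 0) always outranks a fuzzy one (illustrative only)."""
    distance = levenshtein(query, hit)
    if distance > max_distance:
        return 0.0
    return base_score * (max_distance - distance + 1) / (max_distance + 1)

# fuzzy_score("abhi", "abhi", 1.0, 2) == 1.0   (exact match, full score)
# fuzzy_score("abhi", "abhj", 1.0, 2) ~= 0.67  (distance 1 scores lower)
```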
The same should apply when tokenization is involved: a match on a longer search token should score higher than a match on a shorter one.
For example:
- If we tokenize "abhi" with an edge_ngram of (2, 10), we get these tokens: ["ab", "abh", "abhi"].
- Now, if I search for "abhi", I want a document that has "abhi" as an indexed token to score higher than a document that only has "abh".
- Two documents that both have "abhi" indexed should score the same: doc1 with "abhi" and doc2 with "abhinav" produce the same longest matching token under this ngram definition, and it's impossible to predict what else comes after it (see the sketch after this list).
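A minimal sketch of that tokenization behaviour, assuming hypothetical `edge_ngrams` and `token_match_score` helpers that score by the longest query prefix found in the index:

```python
def edge_ngrams(term: str, min_gram: int = 2, max_gram: int = 10) -> list[str]:
    """Emit prefixes of `term` from min_gram to max_gram characters."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]

def token_match_score(query: str, indexed_tokens: set[str]) -> float:
    """Score by the longest query prefix present in the index, so a full
    "abhi" match beats a partial "abh" match."""
    best = max((tok for tok in edge_ngrams(query) if tok in indexed_tokens),
               key=len, default="")
    return len(best) / len(query)

doc1 = set(edge_ngrams("abhi"))     # {"ab", "abh", "abhi"}
doc2 = set(edge_ngrams("abhinav"))  # {"ab", ..., "abhi", ..., "abhinav"}
doc3 = set(edge_ngrams("abh"))      # {"ab", "abh"}

assert token_match_score("abhi", doc1) == token_match_score("abhi", doc2) == 1.0
assert token_match_score("abhi", doc3) == 0.75  # "abh" is only a partial match
```

Under this scheme doc1 and doc2 tie at full score, while doc3 ranks below both; normalizing by query length keeps scores comparable across queries of different lengths.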