Details

Type: Improvement
Resolution: Unresolved
Priority: Critical
Fix Version/s: 7.6.3
Affects Version/s: 7.6.0, 7.2.0
Component/s: fts
Labels: None
Story Points: 0

Description

The search engine scores exact matches and fuzzy matches (those within the specified edit distance) alike. We should score these differently: the closer the hit, the higher the score.
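To illustrate the kind of distance-aware scoring being asked for, here is a minimal sketch. It is not the engine's scoring code: `fuzzy_weight` and its linear decay are hypothetical, chosen only to show an exact hit outranking a hit at edit distance 1, which in turn outranks a hit at distance 2.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def fuzzy_weight(query: str, term: str, max_distance: int) -> float:
    """Hypothetical weight: 1.0 for an exact hit, lower for each extra edit,
    0.0 when the term is outside the allowed fuzziness."""
    distance = levenshtein(query, term)
    if distance > max_distance:
        return 0.0
    # Assumed linear decay; the real scoring formula may differ.
    return 1.0 - distance / (max_distance + 1)


if __name__ == "__main__":
    for term in ["abhi", "abhu", "abxy", "wxyz"]:
        print(term, fuzzy_weight("abhi", term, max_distance=2))
```

A weight like this could be multiplied into whatever score the engine already computes for the hit, so that closer terms rank first without changing which documents match.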

The same should apply when tokenization is involved: a match on the largest search token should score higher.

For example:

If we tokenize "abhi" with an edge_ngram of (2,10), we get these tokens: ["ab", "abh", "abhi"].
Now, if I search for "abhi", I want a document that has "abhi" as an indexed token to score higher than a document that only has "abh".
Two documents that both have "abhi" indexed would still score the same. That is, doc1 with "abhi" and doc2 with "abhinav" can score the same under the ngram definition, because it's impossible to predict what else can come after the largest search token.
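The example above can be sketched as follows. This is only an illustration under stated assumptions: `edge_ngrams` mimics an edge_ngram tokenizer with (min, max) lengths, and `best_match_weight` is a hypothetical helper that weights a document by the longest search token it contains. Neither is the engine's actual API.

```python
from typing import List, Set


def edge_ngrams(text: str, min_len: int = 2, max_len: int = 10) -> List[str]:
    """Prefixes of text with lengths from min_len to max_len (capped at len(text))."""
    upper = min(len(text), max_len)
    return [text[:n] for n in range(min_len, upper + 1)]


def best_match_weight(query: str, indexed_tokens: Set[str],
                      min_len: int = 2, max_len: int = 10) -> float:
    """Hypothetical weight: length of the longest query token found in the
    document, relative to the longest token the query can produce."""
    query_tokens = edge_ngrams(query, min_len, max_len)
    if not query_tokens:
        return 0.0
    longest = len(query_tokens[-1])
    # Try the largest search token first, then progressively shorter ones.
    for token in reversed(query_tokens):
        if token in indexed_tokens:
            return len(token) / longest
    return 0.0


if __name__ == "__main__":
    doc1 = set(edge_ngrams("abhi"))     # contains "abhi"
    doc2 = set(edge_ngrams("abhinav"))  # also contains "abhi"
    doc3 = set(edge_ngrams("abh"))      # largest shared token is "abh"
    for name, tokens in [("doc1", doc1), ("doc2", doc2), ("doc3", doc3)]:
        print(name, best_match_weight("abhi", tokens))
```

With the query "abhi", doc1 and doc2 get the same weight because both contain the largest search token, while doc3 scores lower because its longest match is "abh".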


People

Assignee: Abhi Dangeti

Reporter: Abhi Dangeti

Votes: 0

Watchers: 1

Dates

Created: 24/Apr/24 11:48 AM

Updated: 24/Apr/24 12:27 PM


Rank exact/full hits higher than fuzzy/tokenized hits
