  1. Couchbase Server
  2. MB-33512

FTS - Provide more accurate scoring mechanism between partitions

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: feature-backlog
    • Affects Version/s: 5.5.3, 6.0.1, 6.5.0
    • Component/s: fts
    • None

    Description

      Currently, all TF-IDF scoring is done per pindex (each pindex is an individual bleve index).

      This is a well-understood problem in partitioned search engines, including Elasticsearch: each partition computes term statistics from its own documents only, so scores for documents held in different partitions are not directly comparable.
      In general this matters little, because each partition usually holds a large number of documents and the discrepancy between scores should be minimal.
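The discrepancy can be illustrated with a small sketch. It assumes a textbook smoothed IDF, `1 + ln(N / (df + 1))`, chosen for illustration only and not confirmed here as the exact formula bleve applies. The partition names and document counts are hypothetical.

```python
import math


def idf(doc_count, doc_freq):
    """Smoothed inverse document frequency (illustrative formula, an assumption)."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


def local_and_global_idf(partitions):
    """Return per-partition IDF values and the IDF over the merged statistics.

    `partitions` maps a partition name to (documents in the partition,
    documents in the partition that contain the term).
    """
    local = {name: idf(n, df) for name, (n, df) in partitions.items()}
    total_docs = sum(n for n, _ in partitions.values())
    total_df = sum(df for _, df in partitions.values())
    return local, idf(total_docs, total_df)


# Hypothetical data: two tiny partitions with a skewed term distribution,
# and two large partitions where the term is spread fairly evenly.
small = {"pindex_a": (10, 1), "pindex_b": (10, 7)}
large = {"pindex_a": (100_000, 10_000), "pindex_b": (100_000, 10_500)}

small_local, small_global = local_and_global_idf(small)
large_local, large_global = local_and_global_idf(large)

small_spread = abs(small_local["pindex_a"] - small_local["pindex_b"])
large_spread = abs(large_local["pindex_a"] - large_local["pindex_b"])

print("small partitions:", small_local, "global:", small_global)
print("large partitions:", large_local, "global:", large_global)
print("spread small vs large:", small_spread, large_spread)
```

With the small partitions the same term receives noticeably different weights depending on where a document lives. With the large partitions the two local weights are close to each other and to the global weight.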

      Elasticsearch provides a search type, 'dfs_query_then_fetch' (described in https://www.elastic.co/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch), which makes scoring more accurate across partitions at the cost of an extra round trip to each partition to gather term statistics before the query is scored.
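A minimal sketch of the two approaches, for illustration only: it is neither Elasticsearch's implementation nor an existing Couchbase FTS or bleve API. The partition contents, function names, and the `sqrt(tf) * idf` score are all assumptions, and field-length normalization is ignored.

```python
import math

# Hypothetical partitions: each maps a document id to its list of terms.
PARTITIONS = [
    {"d1": ["beer", "ale"], "d2": ["wine"], "d3": ["wine", "red"]},
    {"d4": ["beer", "lager"], "d5": ["beer", "stout"], "d6": ["beer"]},
]


def idf(doc_count, doc_freq):
    """Smoothed inverse document frequency (illustrative formula, an assumption)."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


def partition_stats(docs, term):
    """Phase 1: report (document count, document frequency of term)."""
    return len(docs), sum(1 for terms in docs.values() if term in terms)


def score_partition(docs, term, doc_count, doc_freq):
    """Phase 2: score matching documents with the statistics supplied."""
    weight = idf(doc_count, doc_freq)
    return {
        doc_id: math.sqrt(terms.count(term)) * weight
        for doc_id, terms in docs.items()
        if term in terms
    }


def search_local(term):
    """Current behaviour: each partition scores with its own statistics."""
    hits = {}
    for docs in PARTITIONS:
        doc_count, doc_freq = partition_stats(docs, term)
        hits.update(score_partition(docs, term, doc_count, doc_freq))
    return hits


def search_global(term):
    """DFS-style: one extra round trip gathers statistics, then all
    partitions score with the same merged values."""
    stats = [partition_stats(docs, term) for docs in PARTITIONS]
    doc_count = sum(n for n, _ in stats)
    doc_freq = sum(df for _, df in stats)
    hits = {}
    for docs in PARTITIONS:
        hits.update(score_partition(docs, term, doc_count, doc_freq))
    return hits


local_hits = search_local("beer")
global_hits = search_global("beer")
print("local :", local_hits)
print("global:", global_hits)
```

Every matching document here contains "beer" exactly once, yet `search_local` scores `d1` higher than `d4` because the term is rarer in the first partition. `search_global` gives all four matches the same score, at the cost of visiting each partition twice.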

      It would be good if Couchbase Server FTS offered a similar mechanism, so that users could trade some query performance for more accurate scores when they need them.

          People

            abhinav Abhi Dangeti
            matt.carabine Matt Carabine (Inactive)
            Votes: 0
            Watchers: 6

