Couchbase Server / MB-20992

[FTS] query string query with only a negated term (-term) analyzed to nothing gives different results from ES

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • 5.0.0
    • 5.0.0
    • fts
    • None
    • Untriaged
    • Unknown

    Description

      Originally part of MB-20566, but broken out into a separate issue so that MB-20566 can be resolved and this one deferred.

      To reproduce:

      ./testrunner -i INI_FILE.ini get-cbcollect-info=True,get-coredumps=True,get-logs=False,stop-on-failure=False,cluster=D+F,GROUP=ALL -t fts.stable_topology_fts.StableTopFTS.index_query_custom_mapping,items=1000,custom_map=True,num_custom_analyzers=1,compare_es=True,cm_id=56,num_queries=100,GROUP=P0
      

      The remaining issue in this test is that:

      2016-09-15 09:05:03 | INFO | MainProcess | Cluster_Thread | [task.execute] ------------------------------------------------------------------ Query # 57 -----------------------------------------------------------------
      2016-09-15 09:05:03 | INFO | MainProcess | Cluster_Thread | [fts_base.run_fts_query] Running query {"from": 0, "indexName": "custom_index", "fields": [], "explain": false, "ctl": {"timeout": 60000, "consistency": {"vectors": {}, "level": ""}}, "query": {"query": "-languages_known:\"German\""}, "size": 10000000} on node: 127.0.0.1:9201
      2016-09-15 09:05:03 | INFO | MainProcess | Cluster_Thread | [task.execute] Status: {u'successful': 32, u'failed': 0, u'total': 32}
      2016-09-15 09:05:03 | INFO | MainProcess | Cluster_Thread | [task.execute] FTS hits for query: {"query": "-languages_known:\"German\""} is 1000 (took 26.677828ms)
      2016-09-15 09:05:03 | INFO | MainProcess | Cluster_Thread | [task.execute] ES hits for query: {"query_string": {"query": "-languages_known:\"German\""}} on es_index is 0 (took 1ms)
      2016-09-15 09:05:03 | ERROR | MainProcess | Cluster_Thread | [task.execute] FAIL: FTS hits: 1000, while ES hits: 0
      2016-09-15 09:05:03 | ERROR | MainProcess | Cluster_Thread | [task.execute] FAIL: Following 1000 doc(s) were not returned by ES,but FTS, printing 50: [u'emp10000538', u'emp10000539', u'emp10000536', u'emp10000537', u'emp10000534', u'emp10000535', u'emp10000532', u'emp10000533', u'emp10000530', u'emp10000531', u'emp10000125', u'emp10000436', u'emp10000127', u'emp10000126', u'emp10000121', u'emp10000120', u'emp10000431', u'emp10000122', u'emp10000129', u'emp10000128', u'emp10000439', u'emp10000438', u'emp10000472', u'emp10000509', u'emp10000508', u'emp10000620', u'emp10000471', u'emp10000222', u'emp10000223', u'emp10000549', u'emp10000221', u'emp10000226', u'emp10000227', u'emp10000224', u'emp10000225', u'emp10000543', u'emp10000542', u'emp10000228', u'emp10000229', u'emp10000547', u'emp10000546', u'emp10000545', u'emp10000544', u'emp10000983', u'emp10000982', u'emp10000981', u'emp10000980', u'emp10000987', u'emp10000986', u'emp10000985']
      

      So FTS returns all docs and ES returns none. The difference comes from what happens when analyzing the search text produces 0 tokens. ES drops the clause entirely when parsing the query string. FTS keeps the clause when building the equivalent BooleanQuery, so the negated clause excludes nothing and every doc is returned.

      Why can't we just change the way the BooleanQuery behaves? Because it would then behave differently from ES. NOTE that ES does return ALL DOCS for this query:

      {
        "query": {
          "bool": {
            "must_not": {
              "match": {
                "message": {
                  "query": "the",
                  "analyzer": "english"
                }
              }
            }
          }
        }
      }
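
The two behaviors described above can be illustrated with a small toy model. This is only a sketch and not Bleve, Couchbase FTS, or Elasticsearch code: the stop-word analyzer, the `DOCS` set, and every function name here are hypothetical, and the rule that a boolean query with no clauses matches nothing is an assumption chosen to agree with the 0 ES hits in the log.

```python
# Toy model only; all names below are hypothetical illustrations.
STOP_WORDS = {"the", "a", "an"}

DOCS = {
    "emp1": "the quick fox",
    "emp2": "a lazy dog",
    "emp3": "an idle cat",
}


def analyze(text):
    """Hypothetical analyzer: lowercase, split on whitespace, drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def matches(doc_text, tokens):
    """A clause matches a doc when they share at least one analyzed token."""
    return bool(set(analyze(doc_text)) & set(tokens))


def run_bool(must_not_clauses):
    """Evaluate a boolean query that has only must_not clauses.

    Assumption: a query with no clauses at all matches nothing.
    Otherwise every doc is returned unless some must_not clause matches it.
    """
    if not must_not_clauses:
        return []
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if not any(matches(text, clause) for clause in must_not_clauses)
    )


def query_string_dropping(negated_text):
    """ES-like query string parsing: a clause with 0 tokens is dropped."""
    tokens = analyze(negated_text)
    return run_bool([tokens] if tokens else [])


def query_string_keeping(negated_text):
    """FTS-like query string parsing: the clause is kept even with 0 tokens."""
    return run_bool([analyze(negated_text)])


if __name__ == "__main__":
    print(query_string_dropping("the"))  # clause dropped, empty query
    print(query_string_keeping("the"))   # empty must_not excludes nothing
    print(run_bool([analyze("the")]))    # explicit bool/must_not form
```

In this model, calling `run_bool` directly with the empty clause corresponds to the explicit `bool`/`must_not` query above, which is why changing how the BooleanQuery itself behaves would diverge from ES; the difference is confined to the query string parsing step.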
      

          People

            apiravi Aruna Piravi (Inactive)
            mschoch Marty Schoch [X] (Inactive)