  1. Couchbase Server
  2. MB-18814

[FTS] MCP: Incorrect query results returned during rebalance of fts nodes

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 4.5.0
    • 4.5.0
    • cbft
    • None

    Description

      Build
      4.5.0-1883

      Testcase
      ./testrunner -i INI_FILE.ini -p skip-cleanup=True,get-cbcollect-info=False,get-logs=False,stop-on-failure=False -t fts.moving_topology_fts.MovingTopFTS.rebalance_out_during_querying,items=10000,cluster=D,F,D+F,F,fail-on-errors=False,num_queries=100,GROUP=P0,num_rebalance=2,compare_es=True

      Elasticsearch (ES) needs to be configured for the above test.

      Steps:
      1. Cluster: D,F,D+F,F (4 nodes, 3 have fts enabled)
      2. Load 10K docs and build an index.
      3. Run 100 queries, compare results to ES. All passed.
      4. Now trigger rebalance out of 2 fts nodes.
      5. In parallel, fire the same 100 queries with ES validation. 12-15 queries failed, with some docs missing from the FTS results.
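
      The ES validation in steps 3 and 5 amounts to a set comparison of doc IDs per query. A minimal sketch of that check follows, assuming each query yields a list of doc IDs from FTS and from ES; the function names and sample doc IDs are hypothetical and are not taken from testrunner.

```python
def missing_from_fts(fts_doc_ids, es_doc_ids):
    """Return the doc IDs that ES returned but FTS did not, sorted."""
    return sorted(set(es_doc_ids) - set(fts_doc_ids))


def validate_queries(results):
    """Map each failing query number to the doc IDs missing from FTS.

    `results` maps query number -> (fts_doc_ids, es_doc_ids).
    A query passes when FTS returned every doc that ES returned.
    """
    failures = {}
    for query_num, (fts_ids, es_ids) in sorted(results.items()):
        missing = missing_from_fts(fts_ids, es_ids)
        if missing:
            failures[query_num] = missing
    return failures


# Made-up sample data for illustration only (not from the test run).
sample = {
    1: (["emp1", "emp2"], ["emp1", "emp2"]),  # FTS matches ES
    2: (["emp1"], ["emp1", "emp3"]),          # FTS is missing emp3
}
print(validate_queries(sample))
```

      This only checks for docs missing from FTS, which is the failure mode seen here; a stricter check would also flag docs that FTS returned and ES did not.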

      Notably, the failed queries [36, 37, 39, 40, 43, 45, 47, 48, 49, 57, 58, 59] are almost consecutive, which suggests a short window (<1 min) during which something goes wrong.
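
      As a rough check on the size of that window, the timestamps logged for queries 36 and 58 can be subtracted, and the failed query numbers can be compared against the range they span. This is a sketch only: it uses just the two timestamps quoted in this report, so the real window (which also covers query 59) is slightly longer.

```python
from datetime import datetime

# Failed query numbers as listed in this report.
failed = [36, 37, 39, 40, 43, 45, 47, 48, 49, 57, 58, 59]

# Timestamps copied from the log lines for query 36 and query 58.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S"
t_query_36 = datetime.strptime("2016-03-21 15:04:28", LOG_TS_FORMAT)
t_query_58 = datetime.strptime("2016-03-21 15:05:02", LOG_TS_FORMAT)

# Seconds between the first failing query and query 58.
elapsed_seconds = (t_query_58 - t_query_36).total_seconds()

# Number of queries issued between the first and last failure, inclusive,
# and how many of those did not fail.
span = failed[-1] - failed[0] + 1
passed_inside_span = span - len(failed)

print(elapsed_seconds, span, passed_inside_span)
```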

      2016-03-21 15:04:28 | INFO | MainProcess | Cluster_Thread | [task.execute] ------------------------------------------------------------------ Query # 36 -----------------------------------------------------------------
      2016-03-21 15:04:28 | INFO | MainProcess | Cluster_Thread | [fts_base.run_fts_query] Running query {"from": 0, "indexName": "default_index", "fields": [], "explain": false, "ctl": {"timeout": 0, "consistency": {"vectors": {}, "level": ""}}, "query": {"field": "manages.reports", "match": "Keelia Kallie Lilith Devi"}, "size": 10000000} on node: 172.23.106.175
      :
      :
      2016-03-21 15:05:02 | INFO | MainProcess | Cluster_Thread | [task.execute] ------------------------------------------------------------------ Query # 58 -----------------------------------------------------------------
      2016-03-21 15:05:02 | INFO | MainProcess | Cluster_Thread | [fts_base.run_fts_query] Running query {"from": 0, "indexName": "default_index", "fields": [], "explain": false, "ctl": {"timeout": 0, "consistency": {"vectors": {}, "level": ""}}, "query": {"field": "dept", "match": "Finance"}, "size": 10000000} on node: 172.23.106.175
      2016-03-21 15:05:02 | INFO | MainProcess | Cluster_Thread | [task.execute] FTS hits for query: {"field": "dept", "match": "Finance"} is 869 (took 43.308229ms)

      However, some queries in this phase also succeeded.

      Full testrunner log - https://gist.github.com/arunapiravi/f88e825a81471512955c
      Attaching cbcollect from all nodes.

      Also confirmed that the same failing queries returned the expected results once the rebalance phase was over.
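
      To re-run one of the failing queries by hand, the request body shown in the log can be rebuilt as below. This is a sketch under assumptions: it posts to the FTS query endpoint `/api/index/<indexName>/query` on port 8094 (the default FTS port), the helper names are hypothetical, and authentication is left out and would need to be added for a real cluster. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_fts_query(field, match, index_name="default_index", size=10000000):
    """Rebuild the query body in the shape logged by fts_base.run_fts_query."""
    return {
        "from": 0,
        "indexName": index_name,
        "fields": [],
        "explain": False,
        "ctl": {"timeout": 0, "consistency": {"vectors": {}, "level": ""}},
        "query": {"field": field, "match": match},
        "size": size,
    }


def build_request(host, body, port=8094):
    """Create (but do not send) a POST request for the FTS query endpoint."""
    # Assumed endpoint layout: /api/index/<indexName>/query on the FTS port.
    url = "http://%s:%d/api/index/%s/query" % (host, port, body["indexName"])
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Query 58 from the log, aimed at the node named in the log line.
req = build_request("172.23.106.175", build_fts_query("dept", "Finance"))
print(req.get_method(), req.full_url)
```

      Sending the request with `urllib.request.urlopen(req)` during and after the rebalance, and comparing the hit counts, would reproduce the comparison described above.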

            People

              apiravi Aruna Piravi (Inactive)
              apiravi Aruna Piravi (Inactive)