Couchbase Server / MB-14124

During and just after Rebalance+hard failover, query (using view based primary index) with scan_consistency=request_plus returns incorrect count

Details

    • Bug
    • Resolution: Won't Fix
    • Critical
    • 4.1.0
    • 4.0.0
    • None
    • Security Level: Public
    • Issue observed in 1655 and reproduced on a dev box with the latest code as of March 25, 9:00 am

    Description

      ./testrunner -i b/resources/dev-6-nodes-xdcr_n1ql_2i.ini doc-per-day=10,skip_build_tuq=True,cbq_version=sherlock,scan_consistency=request_plus -t 2i.recovery_2i.SecondaryIndexingRecoveryTests.test_failover,before=create_index,in_between=query_ops,after=query_ops,groups=simple,dataset=default,doc-per-day=10,services_init=n1ql:kv-kv-kv-index-index,nodes_init=5,nodes_out=2,nodes_out_dist=kv:1-index:1,skip_cleanup=True

      1. Create a 5-node cluster (1: n1ql + kv, 2: kv, 3: kv, 4: index, 5: index)
      2. Create the default bucket and add 20160 items (Employee dataset)
      3. Create a view-based PRIMARY INDEX and wait until it comes online
      5. Create 5 GSI-based indexes and wait until they come online
      6. Fail over 1 kv node + 1 index node and rebalance

      During the rebalance and after it completes, we run queries with scan_consistency=request_plus:

      SELECT * FROM default WHERE join_yr > 2010 and join_yr < 2014 ORDER BY _id

      Results are incorrect. Actual count: 6705. Expected count: 10080.
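The check described above can be sketched in Python as follows. This is only an illustration, not the testrunner code: the helper names `build_query_body` and `count_mismatch` are hypothetical, and the sketch assumes the response carries `metrics.resultCount` as seen in the logs in this report.

```python
from urllib.parse import urlencode

# Statement taken from this report.
QUERY = "SELECT * FROM default WHERE join_yr > 2010 and join_yr < 2014 ORDER BY _id"


def build_query_body(statement, scan_consistency="request_plus"):
    """Form-encode a statement for a POST to the /query/service endpoint."""
    return urlencode({"statement": statement,
                      "scan_consistency": scan_consistency})


def count_mismatch(response, expected):
    """Return (actual, expected) when the row count is wrong, else None.

    `response` is the decoded JSON reply; the row count is read from
    metrics.resultCount (an assumption based on the logged replies).
    """
    actual = response["metrics"]["resultCount"]
    return None if actual == expected else (actual, expected)
```

With the numbers from the failing run, `count_mismatch({"metrics": {"resultCount": 6705}}, 10080)` reports the pair `(6705, 10080)`.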

      EXPLAIN OF THE QUERY

      2015-03-25 12:48:07 | INFO | MainProcess | Cluster_Thread | [rest_client.rebalance] rebalance operation started
      2015-03-25 12:48:07 | INFO | MainProcess | Cluster_Thread | [tuq_helper.run_cbq_query] RUN QUERY EXPLAIN SELECT * FROM default WHERE join_yr > 2010 and join_yr < 2014 ORDER BY _id
      2015-03-25 12:48:07 | INFO | MainProcess | Cluster_Thread | [rest_client.query_tool] query params : statement=EXPLAIN+SELECT+%2A+FROM+default+WHERE+join_yr%3E+2010+and+join_yr+%3C+2014+ORDER+BY+_id+
      2015-03-25 12:48:07 | INFO | MainProcess | Cluster_Thread | [tuq_helper.run_cbq_query] TOTAL ELAPSED TIME: 38.541034ms
      2015-03-25 12:48:07 | INFO | MainProcess | Cluster_Thread | [task.execute] {u'status': u'success', u'metrics': {u'elapsedTime': u'38.541034ms', u'executionTime': u'38.465853ms', u'resultSize': 2216, u'resultCount': 1}, u'results': [{u'#operator': u'Sequence', u'~children': [{u'#operator': u'Sequence', u'~children': [{u'index': u'#primary', u'#operator': u'PrimaryScan', u'namespace': u'default', u'using': u'view', u'keyspace': u'default'}, {u'#operator': u'Parallel', u'~child': {u'#operator': u'Sequence', u'~children': [{u'keyspace': u'default', u'#operator': u'Fetch', u'namespace': u'default'}, {u'#operator': u'Filter', u'condition': u'((2010 < (`default`.`join_yr`)) and ((`default`.`join_yr`) < 2014))'}, {u'#operator': u'InitialProject', u'result_terms': [{u'star': True}]}]}}]}, {u'#operator': u'Order', u'sort_terms': [{u'expr': u'(`default`.`_id`)'}]}, {u'#operator': u'Parallel', u'~child': {u'#operator': u'FinalProject'}}]}], u'requestID': u'bf9dbc2f-831d-4a7f-b8fb-aec0e24503a8', u'signature': u'json'}
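To confirm which index a plan uses without reading the nested dict by eye, the EXPLAIN output can be walked programmatically. The sketch below is a hypothetical helper, not part of testrunner; it assumes only the `#operator`, `~children` and `~child` keys visible in the plan logged above, and `PLAN` is an abbreviated copy of that plan.

```python
def iter_operators(node):
    """Yield every operator dict in an EXPLAIN plan, depth first."""
    if isinstance(node, dict):
        if "#operator" in node:
            yield node
        for child in node.get("~children", []):
            yield from iter_operators(child)
        if "~child" in node:
            yield from iter_operators(node["~child"])


def scans_in_plan(plan):
    """Return (operator, index, using) for each scan operator in the plan."""
    return [(op["#operator"], op.get("index"), op.get("using"))
            for op in iter_operators(plan)
            if op["#operator"].endswith("Scan")]


# Abbreviated copy of the plan from the log above.
PLAN = {
    "#operator": "Sequence",
    "~children": [
        {"#operator": "Sequence", "~children": [
            {"#operator": "PrimaryScan", "index": "#primary",
             "using": "view", "keyspace": "default", "namespace": "default"},
            {"#operator": "Parallel", "~child": {
                "#operator": "Sequence", "~children": [
                    {"#operator": "Fetch", "keyspace": "default",
                     "namespace": "default"},
                    {"#operator": "Filter",
                     "condition": "((2010 < (`default`.`join_yr`)) and "
                                  "((`default`.`join_yr`) < 2014))"},
                ]}},
        ]},
        {"#operator": "Order", "sort_terms": [{"expr": "(`default`.`_id`)"}]},
    ],
}
```

For this plan, `scans_in_plan(PLAN)` lists a single `PrimaryScan` on `#primary` using `view`, which matches the view-based primary index named in the title.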

      After the test failed, I ran the following command and saw that it returned the correct result. Opening this bug to find out why the query does not return correct values while the rebalance is running or just after it.

      Parags-MacBook-Pro-2:testrunner parag$ curl -u Administrator:asdasd -v http://localhost:9499/query/service -d 'statement=SELECT count(*) FROM default WHERE join_yr > 2010 and join_yr < 2014 ORDER BY _id&scan_consistency=REQUEST_PLUS'

      * Adding handle: conn: 0x7ff909004400
      * Adding handle: send: 0
      * Adding handle: recv: 0
      * Curl_addHandleToPipeline: length: 1
      * - Conn 0 (0x7ff909004400) send_pipe: 1, recv_pipe: 0
      * About to connect() to localhost port 9499 (#0)
      * Trying 127.0.0.1...
      * Connected to localhost (127.0.0.1) port 9499 (#0)
      * Server auth using Basic with user 'Administrator'
      > POST /query/service HTTP/1.1
      > Authorization: Basic QWRtaW5pc3RyYXRvcjphc2Rhc2Q=
      > User-Agent: curl/7.30.0
      > Host: localhost:9499
      > Accept: */*
      > Content-Length: 122
      > Content-Type: application/x-www-form-urlencoded
      >
      * upload completely sent off: 122 out of 122 bytes
      < HTTP/1.1 200 OK
      < Content-Length: 381
      < Content-Type: application/json; version=0.8.0
      < Date: Wed, 25 Mar 2015 19:50:33 GMT
      <
      {
          "requestID": "6a5a1f1e-91b4-4814-a792-6d6288f8a7b1",
          "signature": { "$1": "number" },
          "results": [
              { "$1": 10080 }
          ],
          "status": "success",
          "metrics": { "elapsedTime": "1.511045205s", "executionTime": "1.510972654s", "resultCount": 1, "resultSize": 35, "sortCount": 1 }
      }
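To see at which point the count becomes correct, the manual check above could be repeated while the rebalance runs. The following is a minimal sketch under stated assumptions: `extract_count` and `sample_counts` are hypothetical helpers, and `fetch_body` stands for any caller-supplied function that returns the raw JSON reply of the count query (for example, a wrapper around the curl call above).

```python
import json
import time


def extract_count(body):
    """Pull the single aggregate value out of a count-query reply."""
    doc = json.loads(body)
    if doc.get("status") != "success":
        raise ValueError("query failed: %r" % doc.get("status"))
    (row,) = doc["results"]      # exactly one result row expected
    (value,) = row.values()      # exactly one column, e.g. "$1"
    return value


def sample_counts(fetch_body, samples=5, delay_s=1.0):
    """Call fetch_body() repeatedly and return the counts observed."""
    counts = []
    for i in range(samples):
        counts.append(extract_count(fetch_body()))
        if i < samples - 1:
            time.sleep(delay_s)
    return counts
```

Comparing the returned list against the expected 10080 shows whether the wrong counts are limited to the rebalance window or persist after it.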

            People

              keshav Keshav Murthy
              parag Parag Agarwal (Inactive)