Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-22090

[FTS] Partial Rollback : Querying fails while rollback is in progress

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 5.0.0
    • 5.0.0
    • fts

    Description

      Build : 4.7.0-1502

      Test (new test, not yet merged) : ./testrunner -i fts.ini -t fts.moving_topology_fts.MovingTopFTS.partial_rollback,cluster=D,D,D+F,items=20000,update=True,upd=30,upd_del_fields=['dept']

      Steps :
      1. 3 nodes in cluster - A=KV, B=KV, C=KV+FTS
      2. Create a bucket with emp dataset with 20000 items
      3. Create FTS index on this bucket and let indexing complete for all docs.
      4. Stop persistence on Nodes A & B
      5. Perform mutations - 30% updates
      6. Run query "mutated:>0" and note # hits.
      7. Kill memcached on Node A so that replicas on Node B become active.
      8. Start persistence on Node B
      9. Failover Node B
      10. Run query "mutated:>0".

      Issue : Step #10 should return partial results as rollback would be in progress in the background at this point of time. But it returns error 400.

      Testrunner log excerpt: (incl the error)

      2016-12-22 22:25:25 | INFO | MainProcess | test_thread | [fts_base.async_perform_update_delete] Batched updates loaded to cluster(s)
      2016-12-22 22:25:25 | INFO | MainProcess | test_thread | [fts_base.sleep] sleep for 60 secs. Waiting for updates to get indexed... ...
      2016-12-22 22:26:25 | INFO | MainProcess | test_thread | [fts_base.wait_for_indexing_complete] Docs in bucket = 20000, docs in FTS index 'default_index': 20000
      2016-12-22 22:26:25 | INFO | MainProcess | test_thread | [fts_base.wait_for_indexing_complete] FTS indexed 20000 docs in 0.0 mins
      2016-12-22 22:26:25 | INFO | MainProcess | test_thread | [fts_base.run_fts_query] Running query {"from": 0, "indexName": "default_index", "fields": [], "explain": false, "ctl": {"timeout": 60000, "consistency": {"vectors": {}, "level": ""}}, "query": {"query": "mutated:>0"}, "size": 10000000} on node: 10.111.151.103:
      2016-12-22 22:26:25 | INFO | MainProcess | test_thread | [moving_topology_fts.partial_rollback] Hits before rollback: 6000
      2016-12-22 22:26:26 | INFO | MainProcess | test_thread | [moving_topology_fts.partial_rollback] Docs in Bucket : 20000, Docs in Index : 20000
      2016-12-22 22:26:26 | INFO | MainProcess | test_thread | [remote_util.__init__] connecting to 10.111.151.101 with username : root password : couchbase ssh_key: 
      2016-12-22 22:26:26 | INFO | MainProcess | test_thread | [remote_util.__init__] Connected to 10.111.151.101
      2016-12-22 22:26:26 | INFO | MainProcess | test_thread | [remote_util.execute_command_raw] running command.raw on 10.111.151.101: kill -9 $(ps aux | grep 'memcached'  | awk '{print $2}')
      2016-12-22 22:26:26 | INFO | MainProcess | test_thread | [remote_util.execute_command_raw] command executed successfully
      2016-12-22 22:26:26 | INFO | MainProcess | test_thread | [data_helper.direct_client] creating direct client 10.111.151.102:11210 default
      2016-12-22 22:26:26 | INFO | MainProcess | test_thread | [fts_base.__async_failover] Starting failover for nodes:[ip:10.111.151.102 port:8091 ssh_username:root] at C1 cluster 10.111.151.101
      2016-12-22 22:26:27 | INFO | MainProcess | Cluster_Thread | [task._failover_nodes] Failing over 10.111.151.102:8091 with graceful=False
      2016-12-22 22:26:27 | INFO | MainProcess | Cluster_Thread | [rest_client.fail_over] fail_over node ns_1@10.111.151.102 successful
      2016-12-22 22:26:27 | INFO | MainProcess | Cluster_Thread | [task.execute] 0 seconds sleep after failover, for nodes to go pending....
      2016-12-22 22:26:27 | INFO | MainProcess | test_thread | [rest_client.get_nodes] Node 10.111.151.102 not part of cluster inactiveFailed
      2016-12-22 22:26:27 | INFO | MainProcess | test_thread | [fts_base.run_fts_query] Running query {"from": 0, "indexName": "default_index", "fields": [], "explain": false, "ctl": {"timeout": 60000, "consistency": {"vectors": {}, "level": ""}}, "query": {"query": "mutated:>0"}, "size": 10000000} on node: 10.111.151.103:
      2016-12-22 22:26:27 | INFO | MainProcess | test_thread | [moving_topology_fts.partial_rollback] Hits after rollback: 6000
      2016-12-22 22:26:27 | INFO | MainProcess | test_thread | [fts_base.sleep] sleep for 20 secs.  ...
      2016-12-22 22:26:47 | INFO | MainProcess | test_thread | [rest_client.get_nodes] Node 10.111.151.102 not part of cluster inactiveFailed
      2016-12-22 22:26:47 | INFO | MainProcess | test_thread | [fts_base.run_fts_query] Running query {"from": 0, "indexName": "default_index", "fields": [], "explain": false, "ctl": {"timeout": 60000, "consistency": {"vectors": {}, "level": ""}}, "query": {"query": "mutated:>0"}, "size": 10000000} on node: 10.111.151.103:
      2016-12-22 22:26:47 | ERROR | MainProcess | test_thread | [rest_client._http_request] http://10.111.151.103:8094/api/index/default_index/query error 400 reason: status: 400, content: rest_index: Query, indexName: default_index, requestBody: {"from": 0, "indexName": "default_index", "fields": [], "explain": false, "ctl": {"timeout": 60000, "consistency": {"vectors": {}, "level": ""}}, "query": {"query": "mutated:>0"}, "size": 10000000}, req: &http.Request{Method:"POST", URL:(*url.URL)(0xc42d9d6680), Proto:"HTTP/1.1", ProtoMajor:1, ProtoMinor:1, Header:http.Header{"Content-Type":[]string{"application/json"}, "Authorization":[]string{"Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA=="}, "Accept":[]string{"*/*"}, "User-Agent":[]string{"Python-httplib2/$Rev: 259 $"}, "Accept-Encoding":[]string{"identity"}, "Content-Length":[]string{"197"}}, Body:(*http.body)(0xc42227fb40), ContentLength:197, TransferEncoding:[]string(nil), Close:false, Host:"10.111.151.103:8094", Form:url.Values{}, PostForm:url.Values{}, MultipartForm:(*multipart.Form)(nil), Trailer:http.Header(nil), RemoteAddr:"10.111.151.1:55048", RequestURI:"/api/index/default_index/query", TLS:(*tls.ConnectionState)(nil), Cancel:(<-chan struct {})(nil), Response:(*http.Response)(nil), ctx:(*context.cancelCtx)(0xc42227fb80)}, err: bleve: bleveIndexTargets, nil bindex, localPIndex: &cbgt.PIndex{Name:"default_index_34fc9d91a5ea75a1_9e13fa43", UUID:"41b4b17b3dc4ca46", IndexType:"fulltext-index", IndexName:"default_index", IndexUUID:"34fc9d91a5ea75a1", IndexParams:"{}", SourceType:"couchbase", SourceName:"default", SourceUUID:"", SourceParams:"{\"authPassword\":\"\",\"authSaslPassword\":\"\",\"authSaslUser\":\"\",\"authUser\":\"default\",\"clusterManagerBackoffFactor\":0,\"clusterManagerSleepInitMS\":0,\"clusterManagerSleepMaxMS\":20000,\"dataManagerBackoffFactor\":0,\"dataManagerSleepInitMS\":0,\"dataManagerSleepMaxMS\":20000,\"feedBufferAckThreshold\":0,\"feedBufferSizeBytes\":0}", SourcePartitions:"448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479", Path:"/opt/couchbase/var/lib/couchbase/data/@fts/default_index_34fc9d91a5ea75a1_9e13fa43.pindex", Impl:(*bleve.indexImpl)(0xc4203be2a0), Dest:(*cbgt.DestForwarder)(0xc4201a5f30), sourcePartitionsMap:map[string]bool{"454":true, "457":true, "469":true, "471":true, "475":true, "477":true, "452":true, "466":true, "470":true, "468":true, "473":true, "474":true, "450":true, "455":true, "458":true, "460":true, "462":true, "463":true, "465":true, "467":true, "448":true, "449":true, "451":true, "472":true, "478":true, "479":true, "461":true, "464":true, "453":true, "456":true, "459":true, "476":true}, m:sync.Mutex{state:0, sema:0x0}, closed:true}
       rest_index: Query, indexName: default_index, requestBody: {"from": 0, "indexName": "default_index", "fields": [], "explain": false, "ctl": {"timeout": 60000, "consistency": {"vectors": {}, "level": ""}}, "query": {"query": "mutated:>0"}, "size": 10000000}, req: &http.Request{Method:"POST", URL:(*url.URL)(0xc42d9d6680), Proto:"HTTP/1.1", ProtoMajor:1, ProtoMinor:1, Header:http.Header{"Content-Type":[]string{"application/json"}, "Authorization":[]string{"Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA=="}, "Accept":[]string{"*/*"}, "User-Agent":[]string{"Python-httplib2/$Rev: 259 $"}, "Accept-Encoding":[]string{"identity"}, "Content-Length":[]string{"197"}}, Body:(*http.body)(0xc42227fb40), ContentLength:197, TransferEncoding:[]string(nil), Close:false, Host:"10.111.151.103:8094", Form:url.Values{}, PostForm:url.Values{}, MultipartForm:(*multipart.Form)(nil), Trailer:http.Header(nil), RemoteAddr:"10.111.151.1:55048", RequestURI:"/api/index/default_index/query", TLS:(*tls.ConnectionState)(nil), Cancel:(<-chan struct {})(nil), Response:(*http.Response)(nil), ctx:(*context.cancelCtx)(0xc42227fb80)}, err: bleve: bleveIndexTargets, nil bindex, localPIndex: &cbgt.PIndex{Name:"default_index_34fc9d91a5ea75a1_9e13fa43", UUID:"41b4b17b3dc4ca46", IndexType:"fulltext-index", IndexName:"default_index", IndexUUID:"34fc9d91a5ea75a1", IndexParams:"{}", SourceType:"couchbase", SourceName:"default", SourceUUID:"", SourceParams:"{\"authPassword\":\"\",\"authSaslPassword\":\"\",\"authSaslUser\":\"\",\"authUser\":\"default\",\"clusterManagerBackoffFactor\":0,\"clusterManagerSleepInitMS\":0,\"clusterManagerSleepMaxMS\":20000,\"dataManagerBackoffFactor\":0,\"dataManagerSleepInitMS\":0,\"dataManagerSleepMaxMS\":20000,\"feedBufferAckThreshold\":0,\"feedBufferSizeBytes\":0}", SourcePartitions:"448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479", Path:"/opt/couchbase/var/lib/couchbase/data/@fts/default_index_34fc9d91a5ea75a1_9e13fa43.pindex", Impl:(*bleve.indexImpl)(0xc4203be2a0), Dest:(*cbgt.DestForwarder)(0xc4201a5f30), sourcePartitionsMap:map[string]bool{"454":true, "457":true, "469":true, "471":true, "475":true, "477":true, "452":true, "466":true, "470":true, "468":true, "473":true, "474":true, "450":true, "455":true, "458":true, "460":true, "462":true, "463":true, "465":true, "467":true, "448":true, "449":true, "451":true, "472":true, "478":true, "479":true, "461":true, "464":true, "453":true, "456":true, "459":true, "476":true}, m:sync.Mutex{state:0, sema:0x0}, closed:true}
      2016-12-22 22:26:47 | ERROR | MainProcess | test_thread | [fts_base.execute_query] Error running query: 'NoneType' object is not iterable
      2016-12-22 22:26:47 | INFO | MainProcess | test_thread | [moving_topology_fts.partial_rollback] Hits after rollback and one more sleep: -1
      2016-12-22 22:26:47 | INFO | MainProcess | test_thread | [rest_client.get_nodes] Node 10.111.151.102 not part of cluster inactiveFailed
      2016-12-22 22:26:47 | INFO | MainProcess | test_thread | [moving_topology_fts.partial_rollback] Docs in Bucket : 20000, Docs in Index : 15000
      

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              apiravi Aruna Piravi (Inactive)
              mihir.kamdar Mihir Kamdar (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty