Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-56742

Analytics queries are not honouring timeout and taking long time to process

    XMLWordPrintable

Details

    Description

      While running capella scaling tests, it is observed that some of the analytics queries are not honouring the timeout values and are running for a long time. 

      Steps to reproduce on capella - 

      1. Create a 12 node cluster with 3 KV, 3 N1QL, 3 Index and 3 CBAS node.
      2. Create a KV bucket and create 2 collections under _default scope.
      3. Load 1 million docs in each collection.
      4. Create indexes on collection
      5. Create a total of 10 datasets on either of the collections.
      6. Create indexes on the datasets.
      7. Start parallel analytics queries. For this instance of test 10 query thread are executed parallely from the automation script.
      8. Start scaling of cluster by changing the instance type of nodes.
      9. Automation uses Java SDK client to execute analytics queries. The default timeout for analytics queries is 75 sec. 
      10. It is observed that some queries honour this timeout value and gets cancelled, while some do not and run for a long periods.

      Indexes created on datasets -

      HotelIndexes = [
      "create index {0} on {1}(phone:string)",
      "create index {0} on {1}(`type`:string)",
      "create index {0} on {1}(city:string, country:string)",
      "create index {0} on {1}(price:double)",
      "create index {0} on {1}(avg_rating:double)",
      ] 

      Analytics queries that are executed -

      HotelQueries = [
      "SELECT * from {} where phone like \"4%\" limit 100",
       
      "SELECT * from {} where `type` is not null limit 100",
       
      "SELECT COUNT(*) FILTER (WHERE free_breakfast = TRUE) AS count_free_breakfast, COUNT(*) FILTER (WHERE free_parking = TRUE) AS count_free_parking, COUNT(*) FILTER (WHERE free_breakfast = TRUE AND free_parking = TRUE) AS count_free_parking_and_breakfast FROM {} WHERE city LIKE 'North%' ORDER BY count_free_parking_and_breakfast DESC  limit 100",
       
      "WITH city_avg AS (SELECT city, AVG(price) AS avgprice FROM {0} WHERE price IS NOT NULL GROUP BY city) SELECT h.name, h.price FROM {0} h JOIN city_avg ON h.city = city_avg.city WHERE h.price < city_avg.avgprice AND h.price IS NOT NULL limit 100",
       
      "select city,country,count(*) from {} where free_breakfast=True and free_parking=True group by country,city order by country,city limit 100",
       
      "select avg(price) as AvgPrice, min(price) as MinPrice, max(price) as MaxPrice from {} where free_breakfast=True and free_parking=True and price is not null and array_count(public_likes)>5 and `type`='Hotel' group by country limit 100",
      ]
       
      {} - replace this with any dataset names. Dataset names in the logs with be of format ds0, ds1 .... ds9 

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              umang.agrawal Umang
              umang.agrawal Umang
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Gerrit Reviews

                  There are no open Gerrit changes

                  PagerDuty