Uploaded image for project: 'Couchbase Python Client Library'
  1. Couchbase Python Client Library
  2. PYCBC-1471

acouchbase streaming API blocks while when executing queries

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • 4.1.4
    • 4.1.3
    • library
    • None
    • 0
    • SDK16, SDK18

    Description

      We've migrated a legacy application from SDK2 to SDK4 and we see some page loads take 20+ seconds. After some debugging we've isolated the issue to the asyncio event loop being blocked while running a query. Note the warning from asyncio.

      2023-04-10T22:35:55.734611821Z DEBUG:couchbase_cxx_client_file_logger:QUERY: client_context_id="25ad287c-5385-4d95-ba0c-d45e401c81e4", prep=false, "SELECT timestamp, directories FROM supportal WHERE meta().id LIKE \"s3checker::stats::%::%\" AND `node` = $node AND timestamp > $last_ts ORDER BY timestamp ASC LIMIT $limit"
      2023-04-10T22:35:55.734627333Z DEBUG:couchbase_cxx_client_file_logger:[6bce62b7-1e0d-4c2e-2aaf-8fee03b28712/d2e9c66a-4731-4a64-a536-2fafc4248c3d] <172.24.19.21:8093> HTTP request: query, method=POST, path="/query/service", client_context_id="25ad287c-5385-4d95-ba0c-d45e401c81e4", timeout=75000ms
      2023-04-10T22:35:56.084149531Z DEBUG:couchbase_cxx_client_file_logger:[6bce62b7-1e0d-4c2e-2aaf-8fee03b28712/d2e9c66a-4731-4a64-a536-2fafc4248c3d] <172.24.19.21:8093> HTTP response: query, client_context_id="25ad287c-5385-4d95-ba0c-d45e401c81e4", status=200, body=[hidden]
      2023-04-10T22:35:56.084708234Z DEBUG:couchbase_cxx_client_file_logger:[6bce62b7-1e0d-4c2e-2aaf-8fee03b28712/d2e9c66a-4731-4a64-a536-2fafc4248c3d] <172.24.19.21:8093> put HTTP session back to idle connections
      2023-04-10T22:35:56.085869861Z INFO:tornado.access:200 GET /audit/data/s3checker/s3checker-slave-546db996c4-kg8x6?limit=60&last_ts=2023-04-10T21%3A35%3A03.434292%2B00%3A00&generateTs=false&withHistory=false (10.42.4.177) jake.rawsthorne@couchbase.com 3779.43ms
      2023-04-10T22:35:56.086004725Z WARNING:asyncio:Executing <Task finished name='Task-74' coro=<RequestHandler._execute() done, defined at /usr/local/lib/python3.10/site-packages/tornado/web.py:1665> result=None created at /usr/local/lib/python3.10/asyncio/tasks.py:636> took 0.352 seconds
      

      I can reproduce the issue with the below code. I would expect all of the queries to be run in parallel however they run sequentially. If I replace the query with a call to asyncio.sleep() everything works as expected.

      import asyncio
      from acouchbase.cluster import Cluster
      from couchbase.auth import PasswordAuthenticator
      from couchbase.options import ClusterOptions
      import time
       
       
      async def run_query(cluster):
          start = time.time()
          print("start query at ", time.time())
          # Any slow query. I used a UDF that loops 1mil times.
          result = cluster.query("EXECUTE FUNCTION slow()")
          async for _ in result:
              pass
          print("end query at ", time.time())
          print("query took ", time.time() - start)
       
       
      async def main():
          cluster = Cluster('couchbase://localhost',
                            ClusterOptions(PasswordAuthenticator('Administrator', 'password')))
          await cluster.on_connect()
          start = time.time()
          print("start at ", time.time())
          tasks = [asyncio.create_task(run_query(cluster)) for _ in range(10)]
          await asyncio.gather(*tasks)
          print("end at ", time.time())
          print("took ", time.time() - start)
       
      if __name__ == '__main__':
          asyncio.run(main())
      
      

      start at  1681206779.9860458
      start query at  1681206779.98626
      end query at  1681206789.620456
      query took  9.634260892868042
      start query at  1681206789.62063
      end query at  1681206799.364358
      query took  9.743772745132446
      start query at  1681206799.364442
      end query at  1681206809.125785
      query took  9.761384010314941
      start query at  1681206809.125864
      end query at  1681206819.00722
      query took  9.881402254104614
      start query at  1681206819.007304
      end query at  1681206828.9409332
      query took  9.933715105056763
      start query at  1681206828.941061
      end query at  1681206838.6779
      query took  9.736899852752686
      start query at  1681206838.6780171
      end query at  1681206848.292246
      query took  9.614297151565552
      start query at  1681206848.292369
      end query at  1681206857.846459
      query took  9.554147005081177
      start query at  1681206857.846571
      end query at  1681206867.446794
      query took  9.600282192230225
      start query at  1681206867.4469109
      end query at  1681206877.057766
      query took  9.610919952392578
      end at  1681206877.058383
      took  97.07235932350159
      

      Attachments

        For Gerrit Dashboard: PYCBC-1471
        # Subject Branch Project Status CR V

        Activity

          People

            jared.casey Jared Casey
            jake.rawsthorne Jake Rawsthorne
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty