Distributed Transactions Java / TXNJ-99

Fewer GET & SET ops with high server-side CPU usage when running transactions compared to a regular KV load test


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 1.0.0-beta.1

    Description

      Observing fewer GET & SET operations while running transactions compared to a regular workloadA load test.

      Server-side CPU utilization is also high.

      Here is a comparison of the two load tests with durability set to None.

       

      Stats                            | Transaction Test                  | KV Load Test
      OPS                              | ~11000 ops/sec (7748 trans/sec)   | ~328880 ops/sec
      cmd_get                          | ~37000                            | ~164440
      cmd_set                          | ~70000                            | ~164440
      Throughput                       | 7748 trans/sec                    | ~328880 ops/sec
      Server-side CPU utilization (%)  | ~90%                              | ~90%
      Workload                         | 1 transaction = 4 READ + 3 UPDATE | 1 op = 1 READ or 1 UPDATE
      Workload distribution            | 100% transactions                 | 50:50 READ:UPDATE

       

      Cluster Config: 4 nodes, 2 replicas, 12 vCPU, 64 GB RAM

      Test Config: 10M items, 1 KB doc size

      Client Info: YCSB, transactions 1.0.0-beta.1, Java SDK 3.0.0-alpha.6, uniform request distribution, 480 concurrent workers

      WORKLOADTA: ops in a single transaction: 4 READ, 3 UPDATE; Durability 0
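
      A minimal sketch (not the actual YCSB driver code) of what one WORKLOADTA transaction corresponds to against the transactions 1.0.0-beta.1 API with durability set to None; key names and document contents are purely illustrative, and exact class/method names vary slightly between beta releases:

{code:java}
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.json.JsonObject;
import com.couchbase.transactions.TransactionDurabilityLevel;
import com.couchbase.transactions.TransactionGetResult;
import com.couchbase.transactions.Transactions;
import com.couchbase.transactions.config.TransactionConfigBuilder;

public class WorkloadTaSketch {
    public static void main(String[] args) {
        // Connection details and bucket name are placeholders.
        Cluster cluster = Cluster.connect("couchbase://host", "user", "password");
        Collection collection = cluster.bucket("default").defaultCollection();

        // "Durability 0" in the test config corresponds to NONE here.
        Transactions transactions = Transactions.create(cluster,
                TransactionConfigBuilder.create()
                        .durabilityLevel(TransactionDurabilityLevel.NONE)
                        .build());

        // Hypothetical keys; YCSB would draw these from its uniform key distribution.
        String[] readKeys   = {"user1", "user2", "user3", "user4"};
        String[] updateKeys = {"user5", "user6", "user7"};

        transactions.run(ctx -> {
            // 4 READ operations inside the transaction
            for (String key : readKeys) {
                TransactionGetResult doc = ctx.get(collection, key);
            }
            // 3 UPDATE operations: read the document inside the transaction, then replace it
            for (String key : updateKeys) {
                ctx.replace(ctx.get(collection, key),
                        JsonObject.create().put("field", "updated"));
            }
            // Commit happens automatically when the lambda completes without error.
        });

        transactions.close();
        cluster.disconnect();
    }
}
{code}

      This is only meant to make the READ/UPDATE mix concrete; the real YCSB workload implementation may structure the operations differently, but it shows why a single "transaction" bundles several KV operations and cannot be compared one-to-one with raw KV ops/sec.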

       

      * Table updated with the most recent numbers & test config.

       


        Activity

          graham.pople Graham Pople added a comment -

          OK, closing it out.  It turned into quite a sprawling ticket with various ideas and testing of performance improvements, but the core takeaway is that transactions performance is reasonably close to the maximum currently possible, considering durability performance and the current protocol.  There are at least two ways the protocol can be improved in the future, performance-wise, which have been logged separately.


          sharath.sulochana Sharath Sulochana (Inactive) added a comment -

          Graham Pople - I guess we can close this ticket.

           


          graham.pople Graham Pople added a comment -

          Hi Sharath Sulochana, do you think we still need this ticket open?

          graham.pople Graham Pople added a comment - edited

          Shivani Gupta it's the latter - the ATRs are created on-demand.

          Admittedly, the numbers above do indicate that the number of ATRs isn't much of a factor when Durability is enabled.  Though countering that is issue MB-35359, where transactions were actually expiring due to congestion on the ATRs.  It's been worked around, as newer Java clients now automatically retry on that error for up to 2.5 seconds, so the issue is closed - but it remains indicative that there's heavy congestion going on behind the scenes.
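
          (A quick way to observe the on-demand ATR creation described above is to count ATR documents by key prefix. This is a hedged sketch, assuming the "_txn:atr-" key prefix used by the beta transactions clients and a primary index on the bucket; connection details and the bucket name are placeholders.)

{code:java}
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.query.QueryResult;

public class CountAtrs {
    public static void main(String[] args) {
        // Placeholder connection details and bucket name.
        Cluster cluster = Cluster.connect("couchbase://host", "user", "password");

        // Assumes a primary index exists on the bucket, and that ATR document keys
        // carry the "_txn:atr-" prefix used by the beta transactions clients.
        QueryResult result = cluster.query(
                "SELECT COUNT(*) AS atrs FROM `default` WHERE META().id LIKE '_txn:atr-%'");
        System.out.println(result.rowsAsObject().get(0));

        cluster.disconnect();
    }
}
{code}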


          shivani.gupta Shivani Gupta added a comment -

          Graham Pople, question for you:

          Today do you create the 1024 ATRs upfront or the first time a transaction hits its first document on a given vbucket?

          Because when I run small experimental tests I do not see 1024 ATRs right away (which is a relief actually!).

          I agree with your concern that if we go with 20 * 1024 ATRs, they will flood the user's bucket. Until we have a way of filtering ATRs out into a special System Collection, I would not create tons of them.

          Also, given that with Durability level = majority the bottleneck is elsewhere rather than on ATR contention, I am less convinced this change is necessary.


          People

            graham.pople Graham Pople
            sharath.sulochana Sharath Sulochana (Inactive)
            Votes: 0
            Watchers: 9

