Java Couchbase JVM Core / JVMCBC-674

need feedback on durability performance

Details

    Description

      Hey Michael,

      Can you take a look at the performance matrix I compiled for durability? Here is the sheet: https://docs.google.com/spreadsheets/d/1B8v4OZneOeGxJwUj226zA3YDr0Y0gjRSVLwy0IAP9qw/edit?usp=sharing . The SDK2 and SDK3 columns use the old durability parameters (replicateTo and persistTo), and the SDK3 New column uses the new durability levels. Two results confuse me, and I need some input to make sure I ran the tests correctly.

      First: for SDK3 New, all durability levels except durabilityLevel=None show the same performance. I don't see why majority and persistMajority would perform the same. The performance impact is also severe: it drops from 387k to 1k going from None to majority, a drop of more than 99%.

      Second: SDK3 with replicateTo=1 persistTo=0 is significantly slower than replicateTo=1 persistTo=1 and replicateTo=1 persistTo=2. That would imply that adding persistTo increases performance, which doesn't make sense.

      Here is the YCSB code I am using for the tests. I created a branch called couchbase3-new-durability, based on the couchbase3 branch: https://github.com/couchbaselabs/YCSB/blob/couchbase3-new-durability/couchbase3/src/main/java/com/yahoo/ycsb/db/couchbase3/Couchbase3Client.java

      Here is the set of test files I am using: https://github.com/couchbase/perfrunner/tree/master/tests/durability


          Activity

            daschl Michael Nitschinger added a comment - Korrigan Clark can we close this out?

            korrigan.clark Korrigan Clark added a comment - Michael Nitschinger, we just talked, but for bookkeeping, here is the spreadsheet with the data on the number of threads: https://docs.google.com/spreadsheets/d/1B8v4OZneOeGxJwUj226zA3YDr0Y0gjRSVLwy0IAP9qw/edit?usp=sharing If you go to the threads tab at the bottom, you should see the different runs.

            daschl Michael Nitschinger added a comment - Korrigan Clark, can you try with 1 YCSB client and 25 threads? I'm curious at which point the behavior switches: is it the number of YCSB clients or the number of threads?
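
One way to separate the effect of thread count from the number of YCSB clients is to sweep the thread count inside a single process. The following is a minimal sketch, not the YCSB or perfrunner code: `ThreadSweep`, `run`, and `simulatedWrite` are hypothetical names, and the 2 ms sleep merely stands in for a blocking durable write. To test against a real cluster, the stand-in would be replaced with the actual SDK call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

public class ThreadSweep {

    /**
     * Runs op in a closed loop on the given number of threads for roughly
     * `millis` milliseconds and returns the number of completed calls.
     */
    public static long run(int threads, long millis, Runnable op) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        LongAdder completed = new LongAdder();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                // Each worker issues one blocking call at a time until the deadline.
                while (System.nanoTime() < deadline) {
                    op.run();
                    completed.increment();
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(millis + 10_000, TimeUnit.MILLISECONDS)) {
            pool.shutdownNow();
            throw new IllegalStateException("workers did not finish in time");
        }
        return completed.sum();
    }

    /** Hypothetical stand-in for a blocking durable write: sleeps for 2 ms. */
    public static void simulatedWrite() {
        try {
            Thread.sleep(2);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // The run lasts about one second, so the count approximates ops/s.
        for (int threads : new int[] {1, 5, 25, 100}) {
            long ops = run(threads, 1_000, ThreadSweep::simulatedWrite);
            System.out.printf("threads=%d ops/s=%d%n", threads, ops);
        }
    }
}
```

If throughput stops scaling at some thread count in one process, the thread count is the likelier cause; if it only changes when more processes are added, the number of clients is.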

            korrigan.clark Korrigan Clark added a comment - Michael Nitschinger, I can reproduce your findings only when I use 1 YCSB client with a single thread. However, all of the perf tests use 4 YCSB clients with 25 threads each.
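
As a rough sanity check on those numbers, Little's law relates concurrency, throughput, and latency in a closed loop: latency = concurrency / throughput. The sketch below assumes that each of the 4 x 25 threads issues one blocking operation at a time, that the 387k and 1k figures from the description are operations per second, and that they were measured with this thread configuration; none of that is confirmed in this ticket. `ImpliedLatency` and `impliedLatencyMillis` are illustrative names.

```java
import java.util.Locale;

public class ImpliedLatency {

    /** Little's law for a closed loop: latency = concurrency / throughput. */
    public static double impliedLatencyMillis(int concurrentRequests, double opsPerSecond) {
        if (concurrentRequests <= 0 || opsPerSecond <= 0) {
            throw new IllegalArgumentException("inputs must be positive");
        }
        return concurrentRequests * 1000.0 / opsPerSecond;
    }

    public static void main(String[] args) {
        int concurrency = 4 * 25; // assumption: 4 YCSB clients x 25 blocking threads
        System.out.printf(Locale.ROOT, "majority: %.3f ms per op%n",
            impliedLatencyMillis(concurrency, 1_000));
        System.out.printf(Locale.ROOT, "none:     %.3f ms per op%n",
            impliedLatencyMillis(concurrency, 387_000));
    }
}
```

Under these assumptions, 1k operations per second corresponds to about 100 ms per operation, against about 0.26 ms at 387k. Comparing that implied figure with the latencies the client actually observes would show whether the threads are really busy for the whole run.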

            daschl Michael Nitschinger added a comment - I've run local experiments, and ReplicateTo.ONE, PersistTo.NONE is twice as fast as ReplicateTo.ONE, PersistTo.ONE in my Vagrant setup, so I could not reproduce your YCSB finding there.

            • With alpha.4, are you running against a cluster with developer mode enabled? If so, please disable it.

            To see what's going on, I think you should try to reproduce this in a local setup against the cluster. For example, run code like this:

                    Cluster cluster = Cluster.connect(ClusterEnvironment.builder(
                        "10.143.193.101", "Administrator", "password")
                      .serviceConfig(ServiceConfig.keyValueServiceConfig(KeyValueServiceConfig.endpoints(4)))
                      .build());
                    Bucket bucket = cluster.bucket("default");
                    Collection collection = bucket.defaultCollection();

                    while (true) {
                        for (int i = 0; i < Integer.MAX_VALUE; i++) {
                            collection.insert("key-" + i, "foobar", InsertOptions
                                .insertOptions()
                                .timeout(Duration.ofSeconds(10))
                                // Old-style durability; swap in the line below to test the new durability levels.
                                .durability(PersistTo.NONE, ReplicateTo.ONE)
                                // .durabilityLevel(DurabilityLevel.PERSIST_TO_MAJORITY)
                            );
                        }
                    }

            and see if the behavior is the same.

            Also, I would strongly recommend running the same test with the Go or libcouchbase-based variants to double-check that it's actually something in the client and not the server.

            People

              korrigan.clark Korrigan Clark
              korrigan.clark Korrigan Clark