  1. Couchbase Server
  2. MB-34261

With durability, doc ops are almost 7 times slower when replicas=1

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 6.5.0
    • 6.5.0
    • couchbase-bucket
    • Enterprise Edition 6.5.0 build 3248

    Description

      Environment:

      • 2 node cluster
      • Default bucket with replicas = 1

      Steps to reproduce:
      Run the attached Java program (you may have to change the IPs) on the environment described above. It inserts 1000 docs in batches of 10, three times, once for each of the three durability levels listed below.

      Results:
      Time for DurabilityLevel = None is 964 ms
      Time for DurabilityLevel = Majority is 6451 ms
      Time for DurabilityLevel = PERSIST_TO_MAJORITY is 8538 ms
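
      As a quick check of the "almost 7 times slower" figure in the title, the slowdown relative to DurabilityLevel = None can be computed directly from the three timings above. This is only a minimal arithmetic sketch; the class and method names are illustrative and are not taken from the attached App.java. With these numbers the ratios come out at roughly 6.7x for Majority and 8.9x for PERSIST_TO_MAJORITY.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical name), not part of the attached App.java.
final class DurabilitySlowdown {

    /** Ratio of a durable run's elapsed time to the baseline (no durability) run. */
    static double slowdown(long baselineMs, long durableMs) {
        if (baselineMs <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return (double) durableMs / baselineMs;
    }

    public static void main(String[] args) {
        // Timings reported in this issue for 1000 inserts, in milliseconds.
        long noneMs = 964;
        Map<String, Long> durableRuns = new LinkedHashMap<>();
        durableRuns.put("Majority", 6451L);
        durableRuns.put("PERSIST_TO_MAJORITY", 8538L);

        for (Map.Entry<String, Long> run : durableRuns.entrySet()) {
            System.out.printf("%s: %.2fx slower than None%n",
                    run.getKey(), slowdown(noneMs, run.getValue()));
        }
    }
}
```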

      Attachments

        1. App.java
          7 kB
        2. time_comparison
          305 kB

        Issue Links


          Activity

            korrigan.clark Korrigan Clark added a comment -

            Also, Ritesh Agarwal, when I run my latency and throughput tests, the metrics are better at the beginning of the tests, so 1000 docs will show better latency. If you run longer tests, the real-time throughput and latency will degrade further until they reach a limit.

            ritesh.agarwal Ritesh Agarwal added a comment -

            For a 4-node cluster with replicas = 2:
            Majority = (2 + 1)/2 + 1(Active) = 2 nodes

            1. DurabilityLevel.NONE == PersistTo.NONE, ReplicateTo.NONE
            > According to your comment above they are not the same. How?

            2. DurabilityLevel.MAJORITY == PersistTo.NONE, ReplicateTo.ONE (Active + 1 Replica)

            3. DurabilityLevel.PERSIST_TO_MAJORITY (does it include the ACTIVE node for persistence, or is any majority of nodes fine?) == PersistTo.TWO (any two), ReplicateTo.ONE (Active + 1 Replica)

            OR

            DurabilityLevel.PERSIST_TO_MAJORITY = PersistTo.ACTIVE + PersistTo.ONE, ReplicateTo.ONE (Active + 1 Replica)

            4. DurabilityLevel.MAJORITY_AND_PERSIST_ON_MASTER == PersistTo.ACTIVE, ReplicateTo.ONE (Active + 1 Replica)

            Dave Rigby: Can you please confirm this?
            drigby Dave Rigby added a comment -

            1. DurabilityLevel.NONE == PersistTo.NONE, ReplicateTo.NONE
            > according to your comment above they are not the same, how?

            They should be both logically the same and encoded identically on the wire, so there should be no difference between them. Any difference in the numbers we see should be investigated, as that's not expected. There was some suggestion that in the test above, given that we run one durability setting before the other, there could be some difference in JVM warmup / cluster warmup, but that's not yet clear.

            DurabilityLevel.PERSIST_TO_MAJORITY = PersistTo.ACTIVE + PersistTo.ONE, ReplicateTo.ONE(Active + 1 Replica)

            This one. All the Sync Replication levels must always include the current active node as one of the "majority" set of nodes.
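
            To make the majority arithmetic in this thread concrete, here is a small sketch. It assumes the formula quoted in the earlier comment, (replicas + 1)/2 + 1 with integer division, and that the active node always counts toward the majority as stated above. The class and method names are hypothetical and are not part of any Couchbase SDK.

```java
// Illustrative sketch (hypothetical names), assuming majority = copies / 2 + 1
// with integer division, where copies = 1 active + N replicas.
final class SyncWriteMajority {

    /** Total copies of a document: the active copy plus its replicas. */
    static int copies(int replicas) {
        if (replicas < 0) {
            throw new IllegalArgumentException("replicas must be >= 0");
        }
        return replicas + 1;
    }

    /** Number of nodes that must acknowledge a write to form a majority. */
    static int majority(int replicas) {
        return copies(replicas) / 2 + 1;
    }

    /** Replica acknowledgements needed, given the active is always in the majority set. */
    static int replicaAcksNeeded(int replicas) {
        return majority(replicas) - 1;
    }

    public static void main(String[] args) {
        for (int replicas = 1; replicas <= 3; replicas++) {
            System.out.printf("replicas=%d majority=%d replicaAcks=%d%n",
                    replicas, majority(replicas), replicaAcksNeeded(replicas));
        }
    }
}
```

            Under these assumptions, the replicas = 1 bucket in this issue has a majority of 2, so every durable write has to wait for the single replica as well as the active node.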

            wayne Wayne Siu added a comment -

            Ritesh Agarwal

            Do you still see the issue in your tests?  Thanks.


            ritesh.agarwal Ritesh Agarwal added a comment -

            Closing the defect based on Korri's performance test results in the Excel sheet.

            People

              ritesh.agarwal Ritesh Agarwal
              ritesh.agarwal Ritesh Agarwal