  Couchbase Java Client / JCBC-137

During the failure the Java client cannot do any operations

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Core
    • Security Level: Public
    • Labels:
      None

      Description

      We simulated a split-brain by blocking all network traffic on one node for a minute, using iptables with a DROP target.
      During the failure the client cannot do any operations.
      During this minute the console doesn't even report that any node is down.

      Is this the correct behavior? Can the client work with part of the cluster?

      I see the client has some possible failure modes: http://www.couchbase.com/autodocs/java/spymemcached/2.8.3/index.html?net/spy/memcached/FailureMode.html

      Which one is best for keeping the client functional while one node of the cluster is down?
      I cannot choose based on the short descriptions in the API docs.

      Tried all three options for FailureMode.
      Always the same behavior: the client cannot perform any requests.
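      (For reference, a minimal sketch of how a FailureMode is applied when building the client, assuming the couchbase-client 1.x / spymemcached CouchbaseConnectionFactoryBuilder API; exact builder methods may vary by version, and the hosts/bucket are the ones from the YCSB configuration quoted later in this ticket.)

      import java.net.URI;
      import java.util.Arrays;
      import java.util.List;

      import com.couchbase.client.CouchbaseClient;
      import com.couchbase.client.CouchbaseConnectionFactoryBuilder;
      import net.spy.memcached.FailureMode;

      public class FailureModeSketch {
        public static void main(String[] args) throws Exception {
          List<URI> baseUris = Arrays.asList(
              URI.create("http://r1.local:8091/pools"),
              URI.create("http://r2.local:8091/pools"));

          CouchbaseConnectionFactoryBuilder builder = new CouchbaseConnectionFactoryBuilder();
          // Redistribute moves queued ops to another node; Retry keeps retrying the same
          // node; Cancel fails the queued ops immediately.
          builder.setFailureMode(FailureMode.Redistribute);
          builder.setOpTimeout(60000); // matches couchbase.opTimeout=60000 below

          CouchbaseClient client = new CouchbaseClient(
              builder.buildCouchbaseConnection(baseUris, "test", ""));
          client.set("probe-key", 0, "value").get();
          client.shutdown();
        }
      }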


        Activity

        ingenthr Matt Ingenthron added a comment -

        Was this with spymemcached directly? If so, was it going through moxi, or talking to the binary ports with auth?

        What was the workload?

        dnelubin Denis Nelubin added a comment -

        We're using YCSB. And this client: https://github.com/couchbaselabs/YCSB/blob/master/src/couchbase-1.8/src/main/java/com/yahoo/ycsb/couchbase/CouchbaseClient1_8.java

        The issue isn't specific to any particular workload. For example, it happens with YCSB's workload A (50% reads, 50% updates).

        We installed the couchbase-server-community_x86_64_1.8.1.deb downloaded from here: http://www.couchbase.com/download
        It runs moxi processes on the nodes. Does the client connect to them? I don't know.

        The client configuration is:
        couchbase.hosts=r1.local,r2.local,r3.local,r4.local
        couchbase.bucket=test
        couchbase.user=
        couchbase.password=
        couchbase.opTimeout=60000
        #couchbase.failureMode=... tried any
        couchbase.checkOperationStatus=true

        ingenthr Matt Ingenthron added a comment -

        My suspicion here is that the client can do operations, but it appears that all of the workload stops because all threads end up blocking/waiting on the down node. You'd have to look at how YCSB works to determine if that's the case though.

        Take, for example, nodes A, B, C, D. Then assume your workload is randomly reading or writing a key. Depending on how the key hashes, it'd go to one of the four nodes. If this random read/write is in a loop, it'd generate a good amount of traffic. Then assume node C fails, so now we have A, B, (C), D. Assuming that loop waits until a response comes back from the server, all of the threads will quickly end up blocking and retrying on C, as designed.

        After a failover is initiated, everything should go back to normal.

        This is different from split brain. Split brain would be nodes A & B being able to see each other and nodes C & D being able to see each other, with the client possibly seeing either or both groups. If I understand it correctly, all you did was simulate a node failure with the firewall.
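        (A minimal sketch of that pile-up, assuming a plain synchronous get() loop; the thread count and key space are illustrative and the client construction is omitted, so this is not the actual YCSB code.)

        import java.util.Random;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;

        import com.couchbase.client.CouchbaseClient;

        public class BlockingWorkloadSketch {
          static CouchbaseClient client; // construction omitted; see the builder sketch above

          public static void main(String[] args) {
            ExecutorService pool = Executors.newFixedThreadPool(32);
            for (int i = 0; i < 32; i++) {
              pool.submit(new Runnable() {
                public void run() {
                  Random rnd = new Random();
                  while (true) {
                    String key = "user" + rnd.nextInt(1000000);
                    // Synchronous get(): if this key lives on the failed node, the calling
                    // thread blocks here until the op timeout fires. With random keys, all
                    // 32 threads soon end up parked on the same dead node.
                    Object value = client.get(key);
                  }
                }
              });
            }
          }
        }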

        dnelubin Denis Nelubin added a comment -

        Ok.
        You're saying that each of my (32) threads eventually hits (in random order) the non-functional server, and that operation hangs (and actually fails with a timeout error later). Those blocked threads make operations against the other, functional servers impossible.
        Sounds reasonable.

        But theoretically, because I have 1 replica copy, it should be possible to serve the operation from the copy of the data. That requires some intelligence on the client side to route the requests to another server.
        Is there a reasonable way to get such functionality with Couchbase? Run a client-side moxi? What do you recommend?

        ingenthr Matt Ingenthron added a comment -

        At the moment, the answer is no. There is a feature we're looking to add, casually called "replica read" which would allow application code to try reading data from a replica in the event the primary location is unavailable, but it's not complete yet. It wouldn't do anything to help with data mutations though.

        It's worth noting that for any distributed database, this is a design decision. It's not just a function of the client library. In the case of Couchbase Server, we've chosen to focus on making primary key access to data consistent. We alleviate the availability concern by making sure we can failover quickly (it's nearly instantaneous) and doing what we can within reason to make autofailover part of the cluster (autofailover can be set up for as low as a 30s period).

        We believe that's the right model for the DB to back a webapp. There are times when you can increase availability at the cost of consistency, but it means the code you write has to be much more defensive. We trade off momentary availability for consistency.

        I'm not sure of your goals, but I don't think running YCSB with a failed node in a cluster is really the right thing to be doing. If you did want to get as many operations through as possible, you may want to consider using async operations and having a separate thread pool handle the responses. There are a couple of other approaches too, like wrapping the async operations with something that'll give you a callback.
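        (A rough sketch of the async-plus-separate-response-pool idea, assuming the spymemcached asyncGet()/Future API; the pool sizes, the 250 ms wait and the key names are arbitrary placeholders.)

        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.Future;
        import java.util.concurrent.TimeUnit;
        import java.util.concurrent.TimeoutException;

        import com.couchbase.client.CouchbaseClient;

        public class AsyncResponseSketch {
          static CouchbaseClient client; // construction omitted for brevity

          public static void main(String[] args) {
            // One small pool issues operations, another waits on the futures,
            // so a slow or down node only ties up the waiting pool.
            ExecutorService issuers = Executors.newFixedThreadPool(4);
            ExecutorService reapers = Executors.newFixedThreadPool(8);

            for (int i = 0; i < 1000; i++) {
              final String key = "user" + i;
              issuers.submit(() -> {
                Future<Object> f = client.asyncGet(key); // returns immediately
                reapers.submit(() -> {
                  try {
                    Object value = f.get(250, TimeUnit.MILLISECONDS);
                    // ... use value ...
                  } catch (TimeoutException e) {
                    f.cancel(false); // give up on this op instead of stalling the workload
                  } catch (Exception e) {
                    // interrupted / execution errors
                  }
                });
              });
            }
            issuers.shutdown();
            reapers.shutdown();
          }
        }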

        ingenthr Matt Ingenthron added a comment -

        Through the discussion, we worked out that it's not a total lack of availability; rather, because of the way the workload is distributed by the app, all threads will likely end up blocked on the one down node.

        dnelubin Denis Nelubin added a comment -

        So, for my case (a continuously running YCSB workload with one node taken down), it's not possible to keep the cluster functional?
        But in the real world, where actual clients can reconnect to the cluster, avoid the non-functional node, or use async requests, it's not a problem?

        But what is the role of a replica in Couchbase if clients cannot operate on the replicated data when the primary is down?

        bengber bengber added a comment -

        It's certainly clear that there's an availability/consistency tradeoff here. But you guys seem to be misunderstanding each other.

        If a node goes down, what is the correct way for the database to fail over? We certainly don't mind having a brief period of unavailability.

        The question is what is the correct way to test failover on Couchbase?

        1. We have 4 nodes in a cluster: A, B, C, D
        2. Node C goes down (because of network failure, power failure, etc.)
        3. What should happen now?

        What we're experiencing is that the entire cluster fails and stays down. That can't possibly be the correct behavior. So we need to know the proper remediation steps. Is there a server-side command to bring the cluster back into a usable state? Or is there some client-specific logic we should add?

        ingenthr Matt Ingenthron added a comment -

        I think I addressed that in the second paragraph of my reply above. When a node fails, there are two ways to failover and bring the cluster back to a healthy state. Both of these happen immediately when failover is triggered.

        The first is that through the web console (or REST interface) you can trigger failover for data which is primarily available from a given node. This would be done after the administrator decides something is definitely wrong with the given node, and thus he declares a failure.

        The second is that autofailover may be configured to automatically failover if a given node goes down. There is a minimum of 30s to determine if failure has really occurred, but once failure is declared by autofailover, it happens immediately.

        All official Couchbase clients handle this failover, and moxi can handle it for memcached clients which are not cluster aware.

        See the documentation for more info:
        http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-admin-tasks-failover.html (for 1.8)
        http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover.html (for 2.0)

        Note that it's covered in each manual's architecture and concepts chapter too.
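        (A rough sketch of driving both paths from Java over the cluster REST interface, assuming the /settings/autoFailover and /controller/failOver endpoints described in the server REST API documentation; verify the paths against the server version in use, and note the host names, node name and credentials below are placeholders based on this ticket's configuration.)

        import java.io.OutputStream;
        import java.net.HttpURLConnection;
        import java.net.URL;
        import java.nio.charset.StandardCharsets;
        import java.util.Base64;

        public class FailoverRestSketch {
          // Hypothetical admin credentials; adjust for the real cluster.
          static final String ADMIN = "Administrator";
          static final String PASSWORD = "password";

          static void post(String url, String body) throws Exception {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            String auth = Base64.getEncoder()
                .encodeToString((ADMIN + ":" + PASSWORD).getBytes(StandardCharsets.UTF_8));
            conn.setRequestProperty("Authorization", "Basic " + auth);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            try (OutputStream out = conn.getOutputStream()) {
              out.write(body.getBytes(StandardCharsets.UTF_8));
            }
            System.out.println(url + " -> HTTP " + conn.getResponseCode());
          }

          public static void main(String[] args) throws Exception {
            // Enable autofailover with the 30-second minimum period mentioned above.
            post("http://r1.local:8091/settings/autoFailover", "enabled=true&timeout=30");

            // Manually fail over the node that is down (r3.local here, as an example).
            post("http://r1.local:8091/controller/failOver", "otpNode=ns_1@r3.local");
          }
        }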


          People

          • Assignee:
            ingenthr Matt Ingenthron
            Reporter:
            pavelpaulau Pavel Paulau
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Gerrit Reviews

              There are no open Gerrit changes