Couchbase Server / MB-7606

Docs: Failover introduction could use cleaning up

    Details

      Description

      This page needs some cleanup, rewording, and reorganizing: http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover.html

      I'd rather not go through line-by-line, but will if need be. It would be good for someone with fresh eyes to take a look and see if it really makes sense to someone new who is reading it...

      Some things I think need cleaning:
      - The notes/best practices are in rather awkward places and not really in line with what they refer to
      - Perf doesn't degrade after a failover
      - "If the rebalance" should be "If the failover" in the first bullet


        Activity

        kzeller kzeller added a comment -

        Hi,

        I think this sentence is from you and MC put it into the guide:

        "If, however, everything around Couchbase Server and across the various nodes is healthy and that it does indeed look like a single node problem, and that the aggregate traffic can support loading the remaining nodes with all traffic, then the management system may fail the system over using the REST API or command-line tools."

        Here is the context:

        http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover-choosing.html

        This is confusing. What are you trying to say?

        That monitoring software can fail over a node if there is a network issue but the existing cluster can handle the load?

        Thanks,

        Karen

        perry Perry Krug added a comment -

        The words there are generally correct, but definitely a bit convoluted (that's all MC btw).

        It's not talking about a network error, but rather an actual failure of a node. It's basically trying to say that "if everything else looks okay" go ahead and fail over the node...to avoid recommending a failover if the cluster can't handle it.

        Does that help?

        kzeller kzeller added a comment -

        I see, that was confusing.... how is this:

        External monitoring

        Another option is to have a system monitoring the cluster via the Management REST API. Such an external system is in a good position to fail over nodes because it can take into account system components that are outside the scope of Couchbase Server.

        For example, monitoring software can observe that a network switch is failing and that the Couchbase cluster depends on that switch. The system can determine that failing over Couchbase Server nodes will not help the situation and will therefore not fail over the node.

        The monitoring system can also determine that components around Couchbase Server are functioning and that the various nodes in the cluster are healthy. If the monitoring system determines the problem is only with a single node and remaining nodes in the cluster can support aggregate traffic, then the system may fail over the node using the REST API or command-line tools.
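
For illustration only, the decision described above could be sketched roughly like this. It is not taken from the guide or from any Couchbase tool: `NodeStatus`, `node_to_fail_over`, and the `capacity_ops` figure are hypothetical names, and how the monitor gathers health and traffic data is left out entirely. The function only decides whether a failover is reasonable; the actual failover would still be issued through the REST API or command-line tools.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeStatus:
    """Hypothetical view of one cluster node as seen by the monitoring system."""
    name: str            # node identifier (assumed format, not a Couchbase field)
    healthy: bool        # result of whatever health check the monitor runs
    capacity_ops: float  # assumed: operations per second this node can sustain


def node_to_fail_over(
    nodes: List[NodeStatus],
    dependencies_healthy: bool,
    aggregate_traffic_ops: float,
) -> Optional[str]:
    """Return the name of the one node that looks safe to fail over, else None."""
    # If something outside Couchbase Server (for example a network switch)
    # is failing, failing over nodes will not help, so do nothing.
    if not dependencies_healthy:
        return None

    failed = [n for n in nodes if not n.healthy]
    # Only act when the problem is confined to a single node.
    if len(failed) != 1:
        return None

    # The remaining healthy nodes must be able to carry all of the traffic.
    remaining_capacity = sum(n.capacity_ops for n in nodes if n.healthy)
    if remaining_capacity < aggregate_traffic_ops:
        return None

    return failed[0].name
```

A monitor would call `node_to_fail_over(...)` on each polling cycle and trigger a failover only when it returns a node name. Returning `None` in every doubtful case keeps the sketch conservative, which matches the intent of not recommending a failover the cluster can't handle.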

        perry Perry Krug added a comment -

        That's great Karen, much cleaner.

        kzeller kzeller added a comment -

        Thanks for the feedback! : )

        kzeller kzeller added a comment -

        Investigated, rewrote/overhauled:

        - http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover.html
        - http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover-choosing.html
        - http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover-automatic.html
        - http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover-manual.html
        - http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover-handling.html
        - http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover-addback.html

          People

          • Assignee: kzeller
            Reporter: Perry Krug
