MB-25061: Support for auto-failover of Index Service - Cluster Mgmt

Details

    Description

      It is currently possible to enable automatic failover of nodes running the Index service by toggling an internal setting. Given the improvements we are introducing with Replica Indexes and Rebalancing, should we allow this by default in a future release, without the need to use an internal setting?

      The current logic will allow the failover of an Index node as long as there are two nodes running the service.
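The rule above can be sketched in a few lines. This is only an illustration in Python, not the ns_server implementation; the function name `index_rule_allows_failover`, the `cluster` mapping, and the reading of "two nodes" as "at least two, counting the candidate" are all assumptions.

```python
def index_rule_allows_failover(candidate, cluster, min_index_nodes=2):
    """Return True if the two-node rule does not block failing over `candidate`.

    `cluster` maps a node name to the set of services it runs
    (hypothetical shape, chosen only for this sketch).
    """
    if "index" not in cluster.get(candidate, set()):
        # The rule only constrains nodes that run the Index service.
        return True
    index_nodes = [n for n, services in cluster.items() if "index" in services]
    # Assumption: the candidate itself counts towards the minimum.
    return len(index_nodes) >= min_index_nodes


two_index_nodes = {"n1": {"kv"}, "n2": {"index"}, "n3": {"index", "n1ql"}}
one_index_node = {"n1": {"kv"}, "n2": {"index"}}

print(index_rule_allows_failover("n2", two_index_nodes))
print(index_rule_allows_failover("n2", one_index_node))
```

Note that this check counts service nodes only. It says nothing about which indexes live on the candidate, which is the gap the next question raises.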

      When customers are using Equivalent Indexes, we generally advise them to have multiple copies of Indexes for HA purposes - should we explicitly check that we are not removing the last instance of an Index before failing over the node?
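One way to picture the "last instance" check asked about here is below. It is a hypothetical sketch, not existing Couchbase behaviour: the `placement` mapping (index definition to the nodes holding a copy) and the function name are invented for illustration, and it assumes equivalent indexes can be grouped under a shared definition key.

```python
def definitions_lost_on_failover(candidate, placement):
    """Return the index definitions whose only copy lives on `candidate`.

    `placement` maps an index definition key to the set of nodes hosting
    a copy of it (equivalent index or replica). Hypothetical shape.
    """
    return sorted(
        definition
        for definition, hosts in placement.items()
        if candidate in hosts and not (hosts - {candidate})
    )


placement = {
    "by_city": {"n2", "n3"},  # two copies, survives losing either node
    "by_name": {"n2"},        # single copy on n2
}

lost = definitions_lost_on_failover("n2", placement)
print(lost)
```

An empty result would mean the failover removes no last copy; a non-empty one lists the definitions that would become unavailable.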

      If customers are using Replica Indexes, will we automatically instantiate a new replica copy at failover time?

      Is there any difference in behaviour between Adhoc and Prepared Queries when an Index node is failed over?

      We have a quota on the number of nodes that can be automatically failed over without some kind of intervention (currently one). Does this quota need to be extended to track or limit how many failed-over nodes were running each service? Is there any reason why we can't fail over (for example) one Data node and one Index node at the same time?
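To make the quota question concrete, here is a minimal sketch of a counter that enforces a global limit and, optionally, a limit per service. The class `AutoFailoverQuota` and its parameters are hypothetical and do not describe how ns_server tracks its quota; the default `max_total=1` mirrors the "currently one" figure above.

```python
from collections import Counter


class AutoFailoverQuota:
    """Track automatic failovers against a global and a per-service limit."""

    def __init__(self, max_total=1, per_service=None):
        self.max_total = max_total
        self.per_service = dict(per_service or {})  # e.g. {"kv": 1, "index": 1}
        self.failed_nodes = 0
        self.by_service = Counter()

    def allows(self, services):
        """Return True if a node running `services` may still be failed over."""
        if self.failed_nodes >= self.max_total:
            return False
        return all(
            self.by_service[s] < self.per_service[s]
            for s in services
            if s in self.per_service
        )

    def record(self, services):
        """Count one automatic failover of a node running `services`."""
        if not self.allows(services):
            raise RuntimeError("auto-failover quota exhausted")
        self.failed_nodes += 1
        self.by_service.update(services)

    def reset(self):
        """Clear the counters, for example after manual intervention."""
        self.failed_nodes = 0
        self.by_service.clear()


# Global limit of one: a Data failover blocks a later Index failover.
single = AutoFailoverQuota(max_total=1)
single.record({"kv"})
print(single.allows({"index"}))

# Per-service limits: one Data node and one Index node may both fail over.
split = AutoFailoverQuota(max_total=2, per_service={"kv": 1, "index": 1})
split.record({"kv"})
print(split.allows({"index"}), split.allows({"kv"}))
```

Under the single global limit the second failover is refused regardless of service, while the per-service variant permits one Data node and one Index node, which is the scenario the question describes.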

      MB-12740 suggests that we should allow failing over as many nodes as we have bucket replicas - does the same logic apply to the number of Index Replicas?

      PRD - https://docs.google.com/document/d/1QbQ3rWPPUHRsHj_Yf_0x7gYhrsP5KQSgDRl5jZy4_Wo/edit

          Activity

            build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1595 contains ns_server commit 6e7d304 with commit message: MB-25061 perform services safety check outside of orchestrator

            build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1595 contains ns_server commit f5db37c with commit message: MB-25061 safety check for services

            build-team Couchbase Build Team added a comment - Build couchbase-server-7.1.0-1646 contains ns_server commit c61913e with commit message: MB-25061 correctly pass allow_unsafe flag to failover:is_possible/2

            ritam.sharma Ritam Sharma added a comment - Mihir Kamdar - Can you please close the ticket, as test plans are implemented.

            mihir.kamdar Mihir Kamdar (Inactive) added a comment - Closing this ticket as the tests for index auto failover have been implemented.

            People

              mihir.kamdar Mihir Kamdar (Inactive)
              malarky Chris Malarky