Couchbase Server / MB-49735

[Backport MB-46153] Node that fails to join the cluster might destroy all the data in that cluster

Details

    • Untriaged
    • 1
    • Unknown

    Description

      If a node fails to join the cluster because of connectivity problems, it can connect to the cluster later (once connectivity is restored) and destroy the cluster's config by winning all ns_config merge conflicts.
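
A toy model can illustrate this failure mode. The sketch below is not ns_config's actual merge algorithm, which this report does not describe. It assumes a hypothetical rule in which, for each key, the entry with the higher revision counter wins. The names `merge_configs`, `node1`, `node2`, and `sample-bucket` are invented for illustration. It shows how a node whose entries all carry higher revisions would overwrite every key it shares with the cluster.

```python
from typing import Dict, Tuple

# Hypothetical config shape: key -> (revision, value).
Config = Dict[str, Tuple[int, object]]


def merge_configs(local: Config, remote: Config) -> Config:
    """Toy merge: for each key, keep the entry with the higher revision."""
    merged = dict(local)
    for key, (rev, value) in remote.items():
        if key not in merged or rev > merged[key][0]:
            merged[key] = (rev, value)
    return merged


# node1 is the healthy cluster and holds the sample bucket.
node1: Config = {"buckets": (3, ["sample-bucket"])}
# node2 never joined properly, but its (empty) entry has a higher revision.
node2: Config = {"buckets": (7, [])}

cluster_after_merge = merge_configs(node1, node2)
print(cluster_after_merge["buckets"][1])  # the bucket list is now empty
```

Under this assumed rule the outcome does not depend on which side initiates the merge: the entry with the higher revision replaces the other one either way.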

      To repro:
      0) To emulate connectivity problems, modify the code (cb_dist:setup) so that dist connections can't be established while a specific key in an ets table is set
      1) Start a single-node cluster (node1), create a sample bucket, and disable dist connection establishment using #0
      2) Start another node (node2) and disable dist connection establishment for it as well (using #0)
      3) Try adding node2 to node1; the addition will fail for the same reason as in the logs above
      4) Enable dist connection establishment on node2
      5) Within a few seconds, the sample bucket disappears from node1
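
Step 0 amounts to a switch that makes connection setup fail on demand. The sketch below models that idea in Python rather than in the Erlang code the steps refer to. The dictionary stands in for the ets table, and `flags`, `BLOCK_KEY`, and `setup_connection` are hypothetical names, not part of cb_dist.

```python
# Stand-in for the ets table: a process-wide dictionary of flags.
flags: dict = {}

# Hypothetical key; the report does not name the actual one.
BLOCK_KEY = "block_dist_connections"


def setup_connection(peer: str) -> str:
    """Refuse to connect while the blocking flag is set."""
    if flags.get(BLOCK_KEY):
        raise ConnectionError(f"dist connection to {peer} is disabled")
    return f"connected to {peer}"


flags[BLOCK_KEY] = True           # steps 1 and 2: disable connections
try:
    setup_connection("node1")
except ConnectionError as err:
    print(err)                    # step 3: the join attempt fails

flags[BLOCK_KEY] = False          # step 4: re-enable connections
print(setup_connection("node1"))
```

Keeping the switch in a shared table means it can be flipped at runtime, which is what lets step 4 restore connectivity without restarting the node.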

      So broken connectivity and a little bit of bad luck are all that is needed for the problem to happen.

            People

              ritam.sharma Ritam Sharma
              timofey.barmin Timofey Barmin
              Votes: 0
              Watchers: 12

              Dates

                Created:
                Updated:
                Resolved: