  1. Couchbase Server
  2. MB-49735

[Backport MB-46153] Node that fails to join the cluster might destroy all the data in that cluster


Details

    • Untriaged
    • 1
    • Unknown

    Description

      If a node fails to join the cluster because of connectivity problems, it may connect to the cluster later (once connectivity is restored) and destroy the cluster's config by winning all ns_config merge conflicts.
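      To illustrate how "winning all merge conflicts" can wipe out data, here is a toy model in Python. It is a sketch under assumptions, not the real ns_config merge logic: it assumes a simple "newer timestamp wins" rule per key, and the key name `buckets`, the bucket name `sample`, and the timestamps are all hypothetical.

```python
# Toy model only: this is NOT the real ns_config merge algorithm.
from typing import Dict, Tuple

# Each config entry is (value, timestamp); the larger timestamp is assumed to win.
Config = Dict[str, Tuple[object, int]]


def merge(local: Config, remote: Config) -> Config:
    """Merge two configs key by key; on conflict the newer timestamp wins."""
    merged = dict(local)
    for key, (value, ts) in remote.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged


# node1 holds the real cluster config, written some time ago (hypothetical values).
node1 = {"buckets": (["sample"], 100)}
# node2 is assumed to have rewritten its own config later, during the failed join.
node2 = {"buckets": ([], 200)}

cluster_view = merge(node1, node2)
print(cluster_view["buckets"][0])  # prints []
```

      In this model the node with the newer entries wins regardless of which side initiates the merge, so the bucket list from node1 is replaced by the empty list from node2.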

      To repro:
      0) To emulate connectivity problems, modify the code (cb_dist:setup) so that dist connections can't be established while a specific key in an ets table is set
      1) Start a single-node cluster (node1), create a sample bucket, and disable dist connection establishment using #0
      2) Start another node (node2) and disable dist connection establishment for it as well (using #0)
      3) Try adding node2 to node1; the addition will fail for the same reason as in the logs above
      4) Enable dist connection establishment on node2
      5) Within a few seconds, the sample bucket disappears from node1

      So broken connectivity and a little bit of bad luck are all that is needed for the problem to happen
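      The sequence of repro steps can be modelled with a small Python sketch. This is only an illustration of the timeline, not ns_server code: the `Node` class, the `block_dist` key, and the assumption that the flag only prevents a node from initiating connections are all hypothetical.

```python
# Hypothetical model of the repro timeline; it does not use any ns_server API.
class Node:
    def __init__(self, name):
        self.name = name
        self.flags = {}     # stands in for the node's ets table
        self.peers = set()  # names of nodes this node is connected to

    def setup_connection(self, other):
        """Initiate a dist connection unless the local block flag is set."""
        if self.flags.get("block_dist"):
            return False
        self.peers.add(other.name)
        other.peers.add(self.name)
        return True


node1, node2 = Node("node1"), Node("node2")
node1.flags["block_dist"] = True          # step 1: block dist on node1
node2.flags["block_dist"] = True          # step 2: block dist on node2
join_ok = node1.setup_connection(node2)   # step 3: adding node2 fails
node2.flags["block_dist"] = False         # step 4: unblock node2 only
late_ok = node2.setup_connection(node1)   # node2 reaches node1 afterwards
print(join_ok, late_ok)  # prints: False True
```

      The point of the model is the ordering: the join has already failed by the time node2 connects, so node1 ends up talking to a node it never successfully added.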


          Activity

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-7.1.0-1832 contains ns_server commit 71a005f with commit message:
            MB-49735: Fix incorrect return in ns_cluster:perform_actual_join

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-6.6.4-9944 contains ns_server commit 71a005f with commit message:
            MB-49735: Fix incorrect return in ns_cluster:perform_actual_join

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-7.0.3-7018 contains ns_server commit 88f0381 with commit message:
            MB-49735: Merge branch 'mad-hatter' into cheshire-cat

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-7.0.3-7018 contains ns_server commit 71a005f with commit message:
            MB-49735: Fix incorrect return in ns_cluster:perform_actual_join

            timofey.barmin Timofey Barmin added a comment:
            Nirvair Singh Bhinder I think it may, yes

            People

              ritam.sharma Ritam Sharma
              timofey.barmin Timofey Barmin
              Votes: 0
              Watchers: 12
