  1. Couchbase Server
  2. MB-12902

Network partition may lead to data loss

Details

    Description

      Production cluster scenario that led to data loss:

      • 4-node cluster running CB 2.5.1.
      • node-A, node-B and node-C failed over node-D, but the failover message never reached node-D, so node-D does not know it was failed over.
      • Unfortunately, SDK clients were connecting to node-D for the vb-map config (which is incorrect).
      • At this stage a set of vbuckets is active in two different places. Once a rebalance is initiated, mutations that map to the active vbuckets on node-D are very likely to be lost.
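
The scenario above can be sketched as a toy model. This is only an illustration under stated assumptions, not Couchbase code: the function `simulate`, the round-robin vbucket layout, and the rule that a rebalance discards node-D's copies are all hypothetical simplifications.

```python
def simulate(num_vbuckets=8):
    """Return the (vbucket, key) writes lost by a client using node-D's stale map."""
    nodes = ["A", "B", "C", "D"]
    # Before the failover: vbuckets spread round-robin over four nodes (assumed layout).
    old_map = {vb: nodes[vb % 4] for vb in range(num_vbuckets)}

    # A, B and C fail over D: D's vbuckets become active on the survivors.
    survivors = ["A", "B", "C"]
    new_map = {vb: owner if owner != "D" else survivors[vb % 3]
               for vb, owner in old_map.items()}

    # Per-node storage: node -> vbucket -> {key: value}
    store = {node: {} for node in nodes}

    # The client still routes by the old map served by node-D.
    stale_writes = []
    for vb in range(num_vbuckets):
        key = f"k{vb}"
        store[old_map[vb]].setdefault(vb, {})[key] = "v"
        if old_map[vb] == "D":
            stale_writes.append((vb, key))

    # Assumption: the rebalance follows the survivors' map and drops D's copies.
    store["D"].clear()

    # A write is lost if the node that now owns the vbucket never saw it.
    return [(vb, key) for vb, key in stale_writes
            if key not in store[new_map[vb]].get(vb, {})]


print(simulate())  # [(3, 'k3'), (7, 'k7')]
```

In this model every write that the stale map routed to node-D disappears, because the nodes that own those vbuckets after the failover never received it.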

      Sample data from logs to back it up:

      World according to 10.100.0.46 (node-D, which was failed over):

      ```
      {incoming_replications_conf_hashes,
       [{"services.z3",
         [{'ns_1@10.100.0.41',115455469},
          {'ns_1@10.100.0.42',74343578},
          {'ns_1@10.100.0.43',66117516}]},
        {"default",
         [{'ns_1@10.100.0.41',115455469},
          {'ns_1@10.100.0.42',74343578},
          {'ns_1@10.100.0.43',66117516}]},
        {"indigo-session",
         [{'ns_1@10.100.0.41',124324038},
          {'ns_1@10.100.0.42',42364148},
          {'ns_1@10.100.0.43',122578952}]},
        {"services.z2",
         [{'ns_1@10.100.0.41',124324038},
          {'ns_1@10.100.0.42',42364148},
          {'ns_1@10.100.0.43',122578952}]},
        {"indigo",
         [{'ns_1@10.100.0.41',48623555},
          {'ns_1@10.100.0.42',124811468},
          {'ns_1@10.100.0.43',104172491}]},
        {"services",
         [{'ns_1@10.100.0.41',124324038},
          {'ns_1@10.100.0.42',42364148},
          {'ns_1@10.100.0.43',122578952}]}]},
      ```

      World according to 10.100.0.42 (similar for node-A, node-B and node-C):

      ```
      {incoming_replications_conf_hashes,
       [{"services.z3",
         [{'ns_1@10.100.0.41',63506121},
          {'ns_1@10.100.0.43',117606290}]},
        {"default",
         [{'ns_1@10.100.0.41',63506121},
          {'ns_1@10.100.0.43',117606290}]},
        {"indigo-session",
         [{'ns_1@10.100.0.41',101019889},
          {'ns_1@10.100.0.43',114028661}]},
        {"services.z2",
         [{'ns_1@10.100.0.41',101019889},
          {'ns_1@10.100.0.43',114028661}]},
        {"indigo",
         [{'ns_1@10.100.0.41',51937030},
          {'ns_1@10.100.0.43',131953060}]},
        {"services",
         [{'ns_1@10.100.0.41',101019889},
          {'ns_1@10.100.0.43',114028661}]}]},
      ```
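
To make the disagreement between the two excerpts explicit, the replication sources can be compared per node. The sketch below is a rough illustration, not a Couchbase tool: the node names are transcribed from the logs above (hashes omitted), while the helper names `believed_members` and `disputed_nodes`, and the idea of inferring membership from replication sources, are assumptions made for this example.

```python
BUCKETS = ["services.z3", "default", "indigo-session",
           "services.z2", "indigo", "services"]

NODE_41 = "ns_1@10.100.0.41"
NODE_42 = "ns_1@10.100.0.42"
NODE_43 = "ns_1@10.100.0.43"
NODE_46 = "ns_1@10.100.0.46"  # node-D, the node that was failed over

# Incoming replication sources per bucket, as each node reports them in its log.
view_from_46 = {bucket: {NODE_41, NODE_42, NODE_43} for bucket in BUCKETS}
view_from_42 = {bucket: {NODE_41, NODE_43} for bucket in BUCKETS}


def believed_members(self_node, view):
    """Assumed heuristic: a node's cluster is itself plus all its replication sources."""
    members = {self_node}
    for sources in view.values():
        members |= sources
    return members


def disputed_nodes(members_a, members_b):
    """Nodes that appear in one membership view but not in the other."""
    return members_a ^ members_b


members_46 = believed_members(NODE_46, view_from_46)
members_42 = believed_members(NODE_42, view_from_42)

print(len(members_46), len(members_42))                 # 4 3
print(sorted(disputed_nodes(members_46, members_42)))   # ['ns_1@10.100.0.46']
```

Under this heuristic, 10.100.0.46 still sees a four-node cluster that includes itself, while 10.100.0.42 sees a three-node cluster without it.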

          People

            dfinlay Dave Finlay
            asingh Abhishek Singh (Inactive)