Couchbase Server: MB-36634

[Documentation]: Successfully acknowledged sync-write is missing from the bucket when a rebalance failure is simulated via memcached kill.


Details

    Description

      1. Create a 2 node cluster:

      +----------------+----------+--------------+
      | Nodes          | Services | Status       |
      +----------------+----------+--------------+
      | 10.112.180.101 | [u'kv']  | Cluster node |
      | 10.112.180.102 | None     | <--- IN ---  |
      +----------------+----------+--------------+
      

      2. Create bucket:

      http://10.112.180.101:8091/pools/default/buckets with param: replicaIndex=1&maxTTL=0&flushEnabled=1&compressionMode=off&bucketType=membase&name=default&replicaNumber=1&ramQuotaMB=654&threadsNumber=3&evictionPolicy=valueOnly
      

      3. Load 100k docs (test_docs-0:test_docs-99999) with durability=majority.
      4. Change the bucket replica count to 2, add 10.112.180.103, remove 10.112.180.102, and start a rebalance. Load another 100k docs (test_docs-100000:test_docs-199999) in parallel.
      5. Kill memcached on 10.112.180.101 when the rebalance reaches ~40%. The rebalance fails (intentionally).
      Data loading is still in progress, with the expected exceptions.
      6. Restart the rebalance and wait for it to finish. It completes successfully.
      7. Wait for data loading to finish. Retries of all the caught exceptions succeed.
      8. Validate the data.
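
The bucket-creation call in step 2 can be sketched as a form-encoded POST. This is a minimal illustration that only builds the request: the URL and parameters are copied from the report, while the helper name `build_create_bucket_request` is hypothetical, and sending the request would additionally need administrator credentials that the report does not give.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL and parameters copied from step 2 of this report.
BUCKET_URL = "http://10.112.180.101:8091/pools/default/buckets"
BUCKET_PARAMS = {
    "replicaIndex": 1,
    "maxTTL": 0,
    "flushEnabled": 1,
    "compressionMode": "off",
    "bucketType": "membase",
    "name": "default",
    "replicaNumber": 1,
    "ramQuotaMB": 654,
    "threadsNumber": 3,
    "evictionPolicy": "valueOnly",
}


def build_create_bucket_request(url=BUCKET_URL, params=BUCKET_PARAMS):
    """Build (but do not send) the POST request that creates the bucket.

    Hypothetical helper for illustration; authentication is left out
    because the report does not state the credentials used.
    """
    body = urlencode(params).encode("ascii")
    return Request(url, data=body, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would issue the call once an authorization header has been added.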

      Actual result:
      Data validation failed: a few keys are missing from the bucket even though their sync-writes were acknowledged as successful.
      Missing keys: ['test_docs-130287', 'test_docs-130289', 'test_docs-130282', 'test_docs-130294', 'test_docs-130291']

      Expected Result:
      All the data should be present, since every exception was caught and the affected documents were re-inserted.
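
The expectation above amounts to a simple invariant: every key whose write was acknowledged, either on the first attempt or on a retry, must be found in the bucket at validation time. The sketch below shows that bookkeeping in plain Python. It is not the QE framework's code: `write` stands in for whatever durable-write call the loader uses, and all function names are hypothetical.

```python
def doc_keys(start, stop, prefix="test_docs"):
    """Generate keys in the report's naming scheme, e.g. test_docs-130287."""
    return ["%s-%d" % (prefix, i) for i in range(start, stop)]


def load_with_retry(keys, write, max_retries=3):
    """Write every key, retrying on exceptions.

    `write` is a hypothetical callable that raises on failure and returns
    normally once the write is acknowledged. Returns (acked, failed) sets.
    """
    acked, failed = set(), set()
    for key in keys:
        for _attempt in range(max_retries + 1):
            try:
                write(key)
            except Exception:
                continue  # expected during the simulated failure; retry
            acked.add(key)
            break
        else:
            failed.add(key)  # never acknowledged within the retry budget
    return acked, failed


def missing_acked_keys(acked, bucket_keys):
    """Return acknowledged keys that are absent from the bucket."""
    return sorted(set(acked) - set(bucket_keys))
```

A non-empty result from `missing_acked_keys` while `failed` is empty corresponds to the failure described in this report: writes that were acknowledged but are not in the bucket.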

      In the attached pcap, apply the filter couchbase.opaque == 0xe3080000. Packet number 619311 is the insert request for key test_docs-130287, and packet number 619329 is the success response for it.

      But the key is missing from the bucket.

      Note: The pcap is quite big, so please apply the filters. I tried to save the filtered packets through Wireshark, but ran into an issue while doing so and could not.
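
One possible way to extract only the relevant packets without the Wireshark GUI is tshark, using the same display filter. The sketch below only assembles the command line: the file names are placeholders, because the report does not name the capture file, and it assumes tshark is installed with the Couchbase dissector that provides the `couchbase.opaque` field.

```python
def tshark_filter_command(src_pcap, dst_pcap, opaque=0xE3080000):
    """Build a tshark command that writes packets matching the opaque value.

    -r reads the capture, -Y applies a display filter, -w writes the
    matching packets to a new file. The paths are caller-supplied
    placeholders, not names taken from the report.
    """
    display_filter = "couchbase.opaque == 0x%08x" % opaque
    return ["tshark", "-r", src_pcap, "-Y", display_filter, "-w", dst_pcap]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. The default `opaque` value is the one quoted in this report.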

      QE Note:

      -t rebalance_new.swaprebalancetests.SwapRebalanceFailedTests.test_failed_swap_rebalance,nodes_init=2,replicas=1,standard_buckets=1,num-swap=1,new_replica=2,percentage_progress=40,GROUP=P0;durability,durability=MAJORITY,skip_cleanup=True -p infra_log_level=debug,log_level=debug -m rest
      

            People

              shivani.gupta Shivani Gupta
              ritesh.agarwal Ritesh Agarwal