Couchbase Server / MB-37294

[Jepsen] Hang during rebalance in DGM scenario while performing graceful failover

Details

    Description

      While running the following Jepsen test, which performs a graceful failover of a node and then re-adds it to the cluster using delta node recovery, we observed a rebalance hang during the failover stage of the test.
      lein trampoline run test --nodes-file ./nodes --username vagrant --ssh-private-key ./resources/vagrantkey --package /home/couchbase/jenkins/workspace/kv-engine-jepsen-post-commit/install --workload=failover --failover-type=graceful --recovery-type=delta --replicas=2 --no-autofailover --disrupt-count=1 --rate=0 --durability=0:100:0:0 --eviction-policy=value --cas --use-json-docs --doc-padding-size=3072 --hashdump --enable-memcached-debug-log-level --enable-tcp-capture
      Points to note about the test:

      • We are in DGM (less than 50% resident)
      • We have two replicas
      • Each document is about 3MB
      • We're performing Durability Majority writes
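
To make the combination of two replicas and Majority durability concrete, here is a minimal sketch of the arithmetic. It assumes the usual definition of majority as floor(n/2) + 1 over the active copy plus its replicas; the function name `majority` is illustrative and not part of the test or of Couchbase.

```python
def majority(replicas: int) -> int:
    """Copies (active + replicas) that must acknowledge a Majority write.

    Assumes majority = floor(n / 2) + 1, where n is the active copy
    plus the configured replicas.
    """
    chain_size = 1 + replicas  # the active copy plus its replicas
    return chain_size // 2 + 1


# With --replicas=2 as in the test above, n = 3.
required = majority(2)
```

Under that assumption, with two replicas each durable write needs an acknowledgement from the active copy and at least one replica before it completes.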

      I've also managed to collect core dumps of memcached on each node:
      172.28.128.125=node1
      172.28.128.126=node2
      172.28.128.127=node3
      172.28.128.128=node4

      Build: couchbase-server-enterprise_6.5.1-6007-ubuntu16.04

      Attachments

        1. 172.28.128.132-mem-used-512-experiment.png
          81 kB
          Daniel Owen
        2. hang-screen-shot.png
          251 kB
          Richard deMellow
        3. image-2019-12-18-11-30-45-806.png
          182 kB
          Richard deMellow
        4. image-2019-12-18-11-31-01-008.png
          237 kB
          Richard deMellow
        5. image-2019-12-20-14-51-50-994.png
          42 kB
          Ashwin Govindarajulu
        6. image-2019-12-20-14-52-07-154.png
          165 kB
          Ashwin Govindarajulu
        7. jepsen-output-1.log
          18.15 MB
          Richard deMellow
        8. mem-usage-172.28.128.128.png
          76 kB
          Daniel Owen
        9. screenshot-1.png
          70 kB
          Richard deMellow

          Activity

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-6.5.0-4959 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-6.5.1-6023 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()

            ashwin.govindarajulu Ashwin Govindarajulu added a comment:
            Not seeing this issue in MH build 6.5.0-4959. Closing this ticket.

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-7.0.0-1162 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()

            build-team Couchbase Build Team added a comment:
            Build couchbase-server-6.6.0-7519 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()
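
The commit title refers to low-priority vBuckets being starved in the flusher. The sketch below is not the kv_engine change, which is not shown in this ticket. It is a toy Python model of one common way to avoid starvation between two priority queues: cap how many high-priority items are served in a row before a waiting low-priority item gets a turn. The class `Flusher`, its methods, and the `max_high_streak` knob are all hypothetical names chosen for illustration.

```python
from collections import deque


class Flusher:
    """Toy flusher holding high- and low-priority queues of vBucket ids."""

    def __init__(self, max_high_streak=2):
        self.high = deque()
        self.low = deque()
        # Hypothetical knob: high-priority flushes allowed in a row
        # while low-priority work is waiting.
        self.max_high_streak = max_high_streak
        self._high_streak = 0

    def schedule(self, vbid, high_priority=False):
        """Queue a vBucket id for flushing."""
        (self.high if high_priority else self.low).append(vbid)

    def next_vb(self):
        """Return the next vBucket id to flush, or None if both queues are empty."""
        low_is_due = bool(self.low) and self._high_streak >= self.max_high_streak
        if self.high and not low_is_due:
            self._high_streak += 1
            return self.high.popleft()
        if self.low:
            self._high_streak = 0
            return self.low.popleft()
        return None


# High-priority work keeps arriving while one low-priority vBucket waits.
flusher = Flusher(max_high_streak=2)
flusher.schedule(100)  # low priority
order = []
for vbid in range(6):
    flusher.schedule(vbid, high_priority=True)
    order.append(flusher.next_vb())
```

In this model a scheduler that always preferred the high-priority queue would never reach vBucket 100 as long as new high-priority work arrived on every iteration. With the streak cap, vBucket 100 is flushed after two high-priority items.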

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              richard.demellow Richard deMellow
              Votes: 0
              Watchers: 8
