  Couchbase Server
  MB-37294

[Jepsen] Hang during rebalance in DGM scenario while performing graceful failover


Details

    Description

      While running the following Jepsen test, which performs a graceful failover of a node and then re-adds it to the cluster using delta node recovery, we observed a rebalance hang during the failover stage of the test.
      lein trampoline run test --nodes-file ./nodes --username vagrant --ssh-private-key ./resources/vagrantkey --package /home/couchbase/jenkins/workspace/kv-engine-jepsen-post-commit/install --workload=failover --failover-type=graceful --recovery-type=delta --replicas=2 --no-autofailover --disrupt-count=1 --rate=0 --durability=0:100:0:0 --eviction-policy=value --cas --use-json-docs --doc-padding-size=3072 --hashdump --enable-memcached-debug-log-level --enable-tcp-capture
      Points to note about the test:

      • We are in DGM, less than 50% resident
      • We have two replicas
      • Each document is about 3MB
      • We're performing Durability Majority writes

      I've also managed to collect core dumps of memcached on each node:
      172.28.128.125=node1
      172.28.128.126=node2
      172.28.128.127=node3
      172.28.128.128=node4

      Build: couchbase-server-enterprise_6.5.1-6007-ubuntu16.04

      Attachments

        Issue Links


          Activity

            drigby Dave Rigby added a comment -

            Looking at the memcached.log for the node being graceful failed over (.125), I see the following repeated log messages for vb 60 and 61:

            2019-12-17T13:27:13.081769+00:00 INFO 57: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.28.128.125->ns_1@172.28.128.126:default - (vb:61) DcpProducer::addTakeoverStats empty streams list found
            2019-12-17T13:27:13.081908+00:00 WARNING 55: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.28.128.125->ns_1@172.28.128.128:default - (vb:61) ActiveStream::addTakeoverStats: Stream has status StreamDead
            2019-12-17T13:27:13.082702+00:00 INFO 57: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.28.128.125->ns_1@172.28.128.126:default - (vb:60) DcpProducer::addTakeoverStats empty streams list found
            2019-12-17T13:27:13.082812+00:00 WARNING 55: (default) DCP (Producer) eq_dcpq:replication:ns_1@172.28.128.125->ns_1@172.28.128.128:default - (vb:60) ActiveStream::addTakeoverStats: Stream has status StreamDead
            

            These messages repeat every 5s.

            Note that those warnings are printed when requesting stats for an Active (producer) stream which is dead; the following status is returned to ns_server:

                if (!isActive()) {
                    log(spdlog::level::level_enum::warn,
                        "{} "
                        "ActiveStream::addTakeoverStats: Stream has "
                        "status StreamDead",
                        logPrefix);
                    // Return status of does_not_exist to ensure rebalance does not hang.
                    add_casted_stat("status", "does_not_exist", add_stat, cookie);
                    add_casted_stat("estimate", 0, add_stat, cookie);
                    add_casted_stat("backfillRemaining", 0, add_stat, cookie);
                    return;
                }
            

            It's not clear to me why ns_server is repeatedly requesting these stats.

            ns_server - could you please take a look and see why these vBuckets are getting repeatedly polled?

            dfinlay Dave Finlay added a comment -

            Aliaksey Artamonau: would you mind taking a look at this one?

            owend Daniel Owen added a comment -

            Richard deMellow Could you confirm whether we can get out of the rebalance hang - i.e. can we stop the rebalance and restart it, or does it get stuck indefinitely? Thanks

            owend Daniel Owen added a comment -

            Seeing the following messages repeatedly on .128 in ns_server.debug.log:

            [rebalance:debug,2019-12-17T12:28:00.607Z,ns_1@172.28.128.128:<0.2308.0>:janitor_agent:do_wait_seqno_persisted:1034]Got etmpfail while waiting for sequence number 24 to persist for vBucket:83. Will retry.
            [rebalance:debug,2019-12-17T12:28:11.355Z,ns_1@172.28.128.128:<0.2520.0>:janitor_agent:do_wait_seqno_persisted:1034]Got etmpfail while waiting for sequence number 4 to persist for vBucket:591. Will retry.
            [rebalance:debug,2019-12-17T12:28:31.695Z,ns_1@172.28.128.128:<0.2308.0>:janitor_agent:do_wait_seqno_persisted:1034]Got etmpfail while waiting for sequence number 24 to persist for vBucket:83. Will retry.
            [rebalance:debug,2019-12-17T12:28:39.749Z,ns_1@172.28.128.128:<0.5213.0>:janitor_agent:do_wait_seqno_persisted:1034]Got etmpfail while waiting for sequence number 21 to persist for vBucket:254. Will retry.
            [rebalance:debug,2019-12-17T12:28:42.683Z,ns_1@172.28.128.128
            

            owend Daniel Owen added a comment -

            In the memcached log on .128, correspondingly seeing repeated messages:

            2019-12-17T12:28:00.574757+00:00 WARNING (default) Notified the timeout on seqno for vb:83 Check for: 24, Persisted upto: 4, cookie 0x7fa3260f7280
            2019-12-17T12:28:11.230850+00:00 WARNING (default) Notified the timeout on seqno for vb:591 Check for: 4, Persisted upto: 2, cookie 0x7fa3260f8540
            2019-12-17T12:28:31.687138+00:00 WARNING (default) Notified the timeout on seqno for vb:83 Check for: 24, Persisted upto: 4, cookie 0x7fa3260f7280
            2019-12-17T12:28:39.721929+00:00 WARNING (default) Notified the timeout on seqno for vb:254 Check for: 21, Persisted upto: 1, cookie 0x7fa3260e1280
            

            Looks like we are out of memory on .128.


            Aliaksey Artamonau Aliaksey Artamonau (Inactive) added a comment -

            I didn't look at the logs, but timeouts in waiting for seqno persistence would indeed lead to a rebalance hanging. Please assign back if you would like more input from me.
            owend Daniel Owen added a comment -

            Thanks Aliaksey Artamonau for the confirmation - I did initially worry it was something to do with http://review.couchbase.org/#/c/114925/ - but now I'm pretty sure it's not - I think it's just a test issue. Will get Richard deMellow to confirm.


            richard.demellow Richard deMellow added a comment -

            Daniel Owen the rebalance part of the graceful failover seems to hang indefinitely; we have been sitting for about 2 hours with no progress. When I stopped the failover and restarted it while we were sending ops, it hung again. However, once the cluster is in a steady state we are able to perform the failover.
            richard.demellow Richard deMellow added a comment - - edited

            So I've tested this with 512KB documents to reduce the amount of memory we're using, which gives us 83.3% residency, and I'm still seeing the rebalance for the graceful failover get stuck.
            I also increased the VM's memory quota to 2GB to ensure it had plenty of headroom.
            Jepsen test command:
            lein trampoline run test --nodes-file ./nodes --username vagrant --ssh-private-key ./resources/vagrantkey --package ./couchbase-server-enterprise_6.5.1-6007-ubuntu16.04_amd64.deb --workload=failover --failover-type=graceful --recovery-type=delta --replicas=2 --no-autofailover --disrupt-count=1 --rate=0 --durability=0:100:0:0 --eviction-policy=value --cas --use-json-docs --doc-padding-size=512 --skip-teardown &> jepsen-output-1.log
            Failover progress:

            Bucket status:

            Server status:

            I've uploaded the logs of this test run here: test-with-512KB-docs-80%-resident.zip

            owend Daniel Owen added a comment - - edited

            Hi Richard deMellow, we still appear to be going well above the high watermark.

            Could you try an earlier build, to see if we see similar behaviour?

            richard.demellow Richard deMellow added a comment - - edited

            Steps to reproduce it outside of Jepsen:

            1. set up a 4 node cluster
            2. create a 100MB quota bucket with two replicas
            3. use pillowfight to read/write 30 large documents (about 1.5MB each; we want the residency ratio to be about 50%)
            4. gracefully fail over one node; it should get about 50% of the way through and then hang

            Example pillowfight command:
            cbc-pillowfight -U couchbase://10.143.195.101/test -u Administrator -P password -I 30 -t 10 -v -m 1572863 -M 1572864 -d majority

            ritam.sharma Ritam Sharma added a comment -

            Richard, is this also reproduced with a smaller doc size at a 50% DGM ratio? We will try this tomorrow.


            richard.demellow Richard deMellow added a comment -

            Ritam Sharma no, we haven't tried it with a higher doc count and lower document size. However, we think it might be less likely to see the hang, as our hypothesis is that this MB may be due to us not expelling items as much as we should, and the larger documents make it more apparent when we've not expelled an item correctly.
            ashwin.govindarajulu Ashwin Govindarajulu added a comment - - edited

            Able to reproduce the issue with 55 docs of 1.5MB each (resident ratio: 47%).

            pillowfight -U couchbase://172.23.105.155/default -u Administrator -P password -I 60 -t 10 -v -m 1572863 -M 1572864 -d majority
            
            

            Build used: 6.5.0-4947

            Cluster/bucket status after failover successful state,

            Please find the logs for the same:

            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB_37294/collectinfo-2019-12-20T091322-ns_1%40172.23.105.155.zip
            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB_37294/collectinfo-2019-12-20T091322-ns_1%40172.23.105.159.zip
            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB_37294/collectinfo-2019-12-20T091322-ns_1%40172.23.105.205.zip
            https://cb-jira.s3.us-east-2.amazonaws.com/logs/MB_37294/collectinfo-2019-12-20T091322-ns_1%40172.23.105.206.zip


            raju Raju Suravarjjala added a comment -

            Ashwin Govindarajulu Do you know the build number where this test has passed? It would be great if you could specify that.
            drigby Dave Rigby added a comment -

            Cause of the problem identified. The fix is a one-line change to the Flusher (reverting a line incorrectly changed by http://review.couchbase.org/#/c/117261/, a performance/efficiency improvement for SyncWrites). Read on for details...

            Summary

            Commit 1f64b646719dacba8aa78b1101647a56ae94bbb8 modified the Flusher to use VBReadyQueue to manage the low-priority VBuckets waiting to be
            flushed. However, this change introduced a starvation issue for low-priority vBuckets if there are outstanding high-priority vBuckets which are still awaiting seqno_persistence.

            Details

            Consider the following scenario:

            1. At least one SEQNO_PERSISTENCE request is outstanding - VBucket::hpVBReqs is non-empty for at least one vBucket.
            2. This vbucket does not yet have the seqno requested - for example it's a replica vBucket and memory usage is high and replication to it has been throttled.
            3. At least one other (low priority) vBucket is awaiting flushing.

            The consequence is that the low-priority vBucket will never get flushed (not until the high priority vBucket completes its seqno_persistence). This can lead to livelock - if we actually could flush the low-priority vBucket(s) then that would allow CheckpointMemory to be freed (expelling / removing closed, unreferenced checkpoints).

            The actual problem is the logic in Flusher::flushVB(). This is a (needlessly?) complex function, but the high level logic involves switching between two modes (Flusher::doHighPriority):

            A) While there are no outstanding SEQNO_PERSISTENCE vBuckets for this shard, flush the next vBucket in lpVbs (low priority). Reschedule (return true) if there are any more low-priority VBs to flush.

            B) If there are outstanding SEQNO_PERSISTENCE vBuckets for this shard:

            1. If hpVBs (high priority vbs) is empty, populate the hpVBs queue with all vBuckets with outstanding SEQNO_PERSISTENCE requests.
            2. Flush the next vb from hpVBs. Reschedule (return true) if there are any more items in hpVBs.
            3. Once all outstanding SEQNO_PERSISTENCE vBuckets have been flushed, allow an equal number of low-priority vBuckets to be flushed before retrying the outstanding SEQNO_PERSISTENCE vBuckets.

            Step B.3 is crucial to avoid starvation of the low-priority queue - without it, a single slow VBucket with an outstanding SEQNO_PERSISTENCE request can prevent all other vBuckets in the shard from being flushed.

            However, when the aforementioned path to use VBReadyQueue was introduced, it inadvertently prevented step B.3 from actually occurring.

            This is because when re-entering Flusher::flushVB after flushing the last high priority VB from hpVBs (i.e. after step B.2), we incorrectly only check if hpVbs.empty() is true, and if so set doHighPriority to false - i.e. going back to mode A.

            The fix is to restore the logic from before the patch - only switch back to mode A (doHighPriority=false) when both the low and high priority VB queues are empty (see the sketch below).
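
            To make the mode-switching and the effect of the changed check concrete, here is a minimal, self-contained model of the behaviour described above. This is an illustration only, not the actual kv_engine Flusher code: the names hpVbs, lpVbs and doHighPriority come from the description, and everything else (the Sim struct, the integer "vBucket" queues, the output) is invented for the sketch.

                // Toy C++ model of the Flusher::flushVB() mode switching described above.
                // NOT the real kv_engine implementation - purely illustrative.
                #include <deque>
                #include <iostream>

                struct Sim {
                    std::deque<int> hpVbs;       // vBuckets with outstanding SEQNO_PERSISTENCE requests
                    std::deque<int> lpVbs;       // ordinary (low-priority) vBuckets awaiting flush
                    bool doHighPriority = false;
                    bool seqnoSatisfied = false; // the SEQNO_PERSISTENCE request never completes here
                    bool fixedCheck;

                    explicit Sim(bool fixed) : fixedCheck(fixed) {}

                    // One run of the (simplified) flusher; returns true to "reschedule".
                    bool flushVB() {
                        if (!doHighPriority && !seqnoSatisfied) {
                            hpVbs.push_back(0);  // B.1: (re)populate the high-priority queue
                            doHighPriority = true;
                        }
                        if (doHighPriority && !hpVbs.empty()) {
                            std::cout << "flush hp vb:" << hpVbs.front() << "\n"; // B.2
                            hpVbs.pop_front();
                        } else if (!lpVbs.empty()) {
                            std::cout << "flush lp vb:" << lpVbs.front() << "\n"; // A / B.3
                            lpVbs.pop_front();
                        }
                        // The check at the heart of the bug: the buggy version drops back to
                        // mode A as soon as hpVbs drains, so step B.3 never happens.
                        if (fixedCheck ? (hpVbs.empty() && lpVbs.empty()) : hpVbs.empty()) {
                            doHighPriority = false;
                        }
                        return !hpVbs.empty() || !lpVbs.empty() || !seqnoSatisfied;
                    }
                };

                static void run(bool fixed) {
                    std::cout << (fixed ? "-- fixed check --\n" : "-- buggy check --\n");
                    Sim sim(fixed);
                    sim.lpVbs.push_back(1); // one low-priority vBucket waiting to be flushed
                    for (int i = 0; i < 4 && sim.flushVB(); ++i) {
                    }
                }

                int main() {
                    run(false); // vb:1 is never flushed while the request is outstanding
                    run(true);  // vb:1 is flushed once hpVbs drains, then hp work resumes
                }

            With the buggy check the low-priority vBucket never gets flushed while the SEQNO_PERSISTENCE request remains unsatisfied; with the fixed check it is flushed as soon as the high-priority queue drains, which is the behaviour step B.3 is meant to guarantee.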

            drigby Dave Rigby added a comment -

            Note that if the above Flusher bug has been hit, you can confirm it by looking for large amounts of memory in one or more CheckpointManagers, and then checking whether the persistence cursor has failed to advance to the last item in the checkpoint for a prolonged period (it should normally advance within a few seconds in most environments):

            1. Look for VBuckets with large Checkpoint(s):

              cbstats localhost:11210 -u Administrator -p password -b default checkpoint | grep :mem_usage | sort -k2 -n
              

            2. For any large-looking vBuckets, confirm that there are outstanding queued items and what the high seqno is:

              cbstats localhost:11210 -u Administrator -p password -b default vbucket-details <VBID> | grep  "high_seqno\|queue_size"
              

            3. Confirm that the persistenceCursor hasn't advanced to the high seqno:

              cbstats localhost:11210 -u Administrator -p password -b default checkpoint <VBID> | grep "persistence:cursor"
              

            ritam.sharma Ritam Sharma added a comment -

            Dave Rigby - Why do we see it with a large document size and a small bucket size? We tried a smaller document size but were unable to reproduce it; on a personal note, I could not reproduce it with the steps above.

            drigby Dave Rigby added a comment -

            Dave Rigby - Why do we see it with a large document size and a small bucket size? We tried a smaller document size but were unable to reproduce it; on a personal note, I could not reproduce it with the steps above.

            The cluster needs to be sufficiently above the high watermark that replication is paused (99% of bucket quota IIRC). That's easier to hit if you have slow replication / persistence (e.g. in VMs) as you need to have items which cannot be evicted or otherwise freed.

            Daniel Owen Perhaps you could share your config / setup as I believe you had it failing pretty reliably?

            owend Daniel Owen added a comment -

            The configuration I was using is

            • cluster with 4 nodes, only KV service.
            • create a bucket (value only) with 2 replicas and a 100 MB quota

            Then using pillowfight

            ./cbc-pillowfight -U 127.0.0.1:9000/default -u Administrator -P asdasd -I 60 -t 1 -v -m 1572863 -M 1572864 -d majority --rate-limit 10
            


            build-team Couchbase Build Team added a comment -

            Build couchbase-server-6.5.0-4959 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-6.5.1-6023 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()

            ashwin.govindarajulu Ashwin Govindarajulu added a comment -

            Not seeing this issue in MH build 6.5.0-4959.

            Closing this ticket.

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.0.0-1162 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()

            build-team Couchbase Build Team added a comment -

            Build couchbase-server-6.6.0-7519 contains kv_engine commit 2a368c3 with commit message:
            MB-37294: Avoid starvation of low-pri VBs in Flusher::flushVB()

            People

              ashwin.govindarajulu Ashwin Govindarajulu
              richard.demellow Richard deMellow