  1. Couchbase Server
  2. MB-7382

Rebalance froze after a node was failed over and added back (observed mem used > high water mark for bucket)

    Details

      Description

      • 2-node cluster
      • 2 buckets
      • Bucket 'bkt' had a very high percentage of sets in its front-end load.
      • Failed over ec2-54-252-25-132.ap-southeast-2.compute.amazonaws.com, added it back, and started a rebalance.
      • Rebalance froze at around 98%.
      • Stopped the front-end loads; the disk write queue drained.
      • Mem used on both nodes was greater than the high water mark.
      • Restarted Couchbase Server, waited for warmup to complete, and retried the rebalance; it remained frozen at 50%.
      • Rebooted the nodes, waited for warmup to complete, and retried the rebalance; it remained frozen at 50%.

      Cluster diags:
      1 https://s3.amazonaws.com/bugdb/MB-7382/ec2-54-252-25-132.ap-southeast-2.compute.amazonaws.com-8091-diag.txt.gz

      2 https://s3.amazonaws.com/bugdb/MB-7382/ec2-54-252-20-171.ap-southeast-2.compute.amazonaws.com-8091-diag.txt.gz

      Attached the cbstats output (all, raw memory) for both nodes.
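
      The "mem used greater than high water mark" observation can be checked against saved cbstats output. The following is a minimal sketch, not the procedure used in this report. It assumes the output consists of "name: value" lines and that the relevant stats are named mem_used and ep_mem_high_wat. The sample values are hypothetical and are not taken from the attached files.

```python
def parse_cbstats(text):
    """Parse 'name: value' lines (assumed cbstats output format) into a dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the first colon; lines without one are ignored.
        name, sep, value = line.partition(":")
        if sep:
            stats[name.strip()] = value.strip()
    return stats


def over_high_water_mark(stats):
    """Return True when mem_used exceeds ep_mem_high_wat (assumed stat names)."""
    return int(stats["mem_used"]) > int(stats["ep_mem_high_wat"])


# Hypothetical sample values, for illustration only.
SAMPLE = """
 ep_mem_high_wat:   1000
 ep_mem_low_wat:    800
 mem_used:          1200
"""

if __name__ == "__main__":
    print(over_high_water_mark(parse_cbstats(SAMPLE)))
```

      Running the same check on each node's saved output shows whether a node is above its high water mark at the time the stats were collected.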


        Activity

        abhinav Abhinav Dangeti created issue -
        abhinav Abhinav Dangeti made changes -
        Field Original Value New Value
        Description: added the "2 buckets" bullet. The rest of the text was unchanged and still ended with:

        Live cluster:
        http://ec2-54-252-25-132.ap-southeast-2.compute.amazonaws.com:8091/
        http://ec2-54-252-20-171.ap-southeast-2.compute.amazonaws.com:8091/
        trond Trond Norbye made changes -
        Component/s couchbase-bucket [ 10173 ]
        Component/s bucket-engine [ 10010 ]
        chiyoung Chiyoung Seo made changes -
        Assignee: Chiyoung Seo [ chiyoung ] → Abhinav Dangeti [ abhinav ]
        abhinav Abhinav Dangeti made changes -
        Description: replaced the "Live cluster:" section with "Cluster diags:", listing each node's URL followed by its diag archive:

        Cluster diags:
        1 http://ec2-54-252-25-132.ap-southeast-2.compute.amazonaws.com:8091/
        https://s3.amazonaws.com/bugdb/MB-7382/ec2-54-252-25-132.ap-southeast-2.compute.amazonaws.com-8091-diag.txt.gz

        2 http://ec2-54-252-20-171.ap-southeast-2.compute.amazonaws.com:8091/
        https://s3.amazonaws.com/bugdb/MB-7382/ec2-54-252-20-171.ap-southeast-2.compute.amazonaws.com-8091-diag.txt.gz
        abhinav Abhinav Dangeti made changes -
        Description: removed the live node URLs from "Cluster diags:", leaving only the two diag archive links. The rest of the text was unchanged.
        farshid Farshid Ghods (Inactive) made changes -
        Fix Version/s 2.0.1 [ 10399 ]
        Fix Version/s 2.0 [ 10114 ]
        abhinav Abhinav Dangeti made changes -
        Status: Open [ 1 ] → Resolved [ 5 ]
        Resolution: Won't Fix [ 2 ]
        mikew Mike Wiederhold made changes -
        Sprint Status Current Sprint
        abhinav Abhinav Dangeti made changes -
        Status: Resolved [ 5 ] → Closed [ 6 ]

          People

          • Assignee:
            abhinav Abhinav Dangeti
            Reporter:
            abhinav Abhinav Dangeti