  1. Couchbase Server
  2. MB-3158

cluster failover/rebalance expected behavior when one or more nodes are out of disk space

    Details

    • Flagged:
      Release Note

      Description

      One or more nodes running out of disk space must not take down a cluster. Data already stored on a node should stay accessible; only writes to membase buckets should fail.

      Nodes should still be able to be removed from cluster or failed over.

      Attention also needs to be paid to the behaviour of other areas that use disk space, such as logs.
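
      The expected behaviour described above (reads keep working, only writes fail when the disk is full) can be illustrated with a small sketch. This is not how membase or ep_engine implements it; `GuardedStore`, `DiskFullError`, and the `min_free_bytes` threshold are hypothetical names chosen for illustration. Only `shutil.disk_usage` is a real standard-library call.

```python
import shutil


class DiskFullError(Exception):
    """Raised when a write is refused because free disk space is too low."""


class GuardedStore:
    """Hypothetical in-memory store that gates writes on free disk space."""

    def __init__(self, path="/", min_free_bytes=64 * 1024 * 1024,
                 usage_fn=shutil.disk_usage):
        self.path = path
        self.min_free_bytes = min_free_bytes  # assumed threshold, not a membase setting
        self.usage_fn = usage_fn              # injectable so the check can be tested
        self.data = {}

    def _disk_ok(self):
        # disk_usage returns a named tuple with total, used, and free bytes.
        return self.usage_fn(self.path).free >= self.min_free_bytes

    def get(self, key):
        # Reads never consult the disk check, so stored data stays accessible.
        return self.data.get(key)

    def set(self, key, value):
        # Only writes are refused when the disk is (nearly) full.
        if not self._disk_ok():
            raise DiskFullError("refusing write: disk at %r is full" % self.path)
        self.data[key] = value
```

      The point of the design is that the disk check sits only on the write path, so a full disk degrades the node to read-only instead of making it unavailable.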


          Activity

          frank Frank Weigel created issue -
          sean Sean Lynch (Inactive) made changes -
          Field Original Value New Value
          Assignee Sean Lynch [ sean ]
          sean Sean Lynch (Inactive) made changes -
          Fix Version/s 1.7 [ 10111 ]
          sean Sean Lynch (Inactive) made changes -
          Fix Version/s 1.7 [ 10111 ]
          frank Frank Weigel added a comment -

          A Pivotal Tracker story has been created for this Issue: http://www.pivotaltracker.com/story/show/9305789

          alkondratenko Aleksey Kondratenko (Inactive) made changes -
          Assignee Sean Lynch [ sean ]
          farshid Farshid Ghods (Inactive) made changes -
          Assignee Farshid Ghods [ farshid ]
          farshid Farshid Ghods (Inactive) added a comment -

          try this scenario and update before RC

          farshid Farshid Ghods (Inactive) added a comment -

          When one node runs out of disk space, memcached goes into pending mode, and the user can rebalance this node out of the cluster.

          Shutting down bucket "default" on 'ns_1@172.16.75.128' for server shutdown ns_memcached002 ns_1@172.16.75.128 18:48:24 - Tue May 24, 2011
          Usage of disk "/" on node "172.16.75.128" is over 100%

          If two nodes run out of disk space, you will not be able to fail over those two nodes, because failover will time out.
          The workaround: if two or more nodes run out of disk space, stop membase server on those nodes first; you can then fail them over.
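
          The ordering in this comment (one node: rebalance out; two or more nodes: stop the server on each, then fail each over) can be written down as a small planning function. This is only a sketch of the decision logic; `plan_recovery` and the action labels `rebalance_out`, `stop_server`, and `failover` are hypothetical names, not membase commands or API calls.

```python
def plan_recovery(nodes_out_of_disk):
    """Return ordered (action, node) steps for nodes that ran out of disk space.

    Action names are illustrative labels only, not real CLI or REST commands.
    """
    nodes = list(nodes_out_of_disk)
    if not nodes:
        return []
    if len(nodes) == 1:
        # A single out-of-disk node can be rebalanced out directly.
        return [("rebalance_out", nodes[0])]
    # With two or more nodes, failover was observed to time out while the
    # server was still running, so stop it on every node before failing over.
    steps = [("stop_server", node) for node in nodes]
    steps += [("failover", node) for node in nodes]
    return steps
```

          For two affected nodes, the plan stops the server on both before attempting any failover, which matches the order given in the workaround.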

          farshid Farshid Ghods (Inactive) made changes -
          Summary Original Value: Having a node running out of disk space cannot take down a cluster. New Value: cluster failover/rebalance expected behavior when one or more nodes are out of disk space
          Issue Type Story [ 6 ] Bug [ 1 ]
          Assignee Farshid Ghods [ farshid ]
          Fix Version/s 1.7.1 [ 10183 ]
          Fix Version/s 1.7 GA [ 10111 ]
          Priority Blocker [ 1 ] Major [ 3 ]
          Component/s ns_server [ 10019 ]
          Flagged [Release Note]
          perry Perry Krug added a comment -

          Just so long as we understand that this is still a bug. Failover should NEVER time out.

          perry Perry Krug made changes -
          Link This issue duplicates MB-3401 [ MB-3401 ]
          farshid Farshid Ghods (Inactive) made changes -
          Labels 1.7.0-release-notes
          farshid Farshid Ghods (Inactive) made changes -
          Labels 1.7.0-release-notes 1.7.0-release-notes 1.7.1-release-notes
          farshid Farshid Ghods (Inactive) made changes -
          Fix Version/s 2.0 Beta [ 10113 ]
          Fix Version/s 1.7.1 [ 10183 ]
          farshid Farshid Ghods (Inactive) made changes -
          Component/s couchbase-bucket [ 10173 ]
          Component/s ep_engine [ 10013 ]
          peter peter added a comment -

          Farshid, this can be closed right?

          peter peter made changes -
          Assignee Farshid Ghods [ farshid ]
          peter peter made changes -
          Assignee Farshid Ghods [ farshid ] Dipti Borkar [ dipti ]
          Fix Version/s 2.0 [ 10114 ]
          Fix Version/s 2.0-beta [ 10113 ]
          dipti Dipti Borkar made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]

            People

            • Assignee:
              dipti Dipti Borkar
              Reporter:
              frank Frank Weigel