Details

Type: Technical task
Resolution: Duplicate
Priority: Critical
Fix Version/s: feature-backlog
Affects Version/s: 2.0
Component/s: couchbase-bucket
Security Level: Public
Labels:
- customer
- supportability

Description

(updated by alk: I cannot fix the top posting, but I took some names out of this)

Set up a node, fill the filesystem, watch processes run but see memcached take connections and just fail to respond.

Also, set up a node, stop Couchbase. Fill the filesystem. Start Couchbase.
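The "fill the filesystem" step in both scenarios could be scripted roughly as follows. This is only a sketch for a scratch test machine: `fill_filesystem` is a hypothetical helper, not part of Couchbase, and the `max_bytes` cap is an added safety assumption so the script cannot consume more space than intended.

```python
import errno
import os


def fill_filesystem(directory, max_bytes, chunk_size=1024 * 1024):
    """Write filler data under `directory` until the filesystem reports
    ENOSPC or `max_bytes` have been written.

    Returns (bytes_written, hit_enospc).
    """
    path = os.path.join(directory, "filler.bin")  # hypothetical file name
    chunk = b"\0" * chunk_size
    written = 0
    # buffering=0 gives an unbuffered file, so a full disk surfaces as an
    # OSError from write() itself rather than later on close().
    with open(path, "wb", buffering=0) as f:
        try:
            while written < max_bytes:
                want = min(chunk_size, max_bytes - written)
                # Unbuffered write() may write fewer bytes than requested.
                written += f.write(chunk[:want])
        except OSError as e:
            if e.errno != errno.ENOSPC:
                raise
            return written, True
    return written, False
```

Pointing `directory` at the filesystem that holds the node's data and setting `max_bytes` above its free space should drive it to full. Delete `filler.bin` afterwards to recover.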

On Wed, Mar 28, 2012 at 5:07 PM, Sharon Barr <XXXX> wrote:

Unix is more mature than Couchbase at the edge cases. We are getting there... or trying NOT to get there at all (another alternative).

From: Matt Ingenthron
Sent: Wednesday, March 28, 2012 8:04 AM
To: Frank Weigel; Perry Krug; Dipti Borkar
Cc: Sharon Barr; Alex Ma; support-internal

Subject: Re: YYYY having issues

Incidentally, while testing the hotfix for AAAA with TMP_OOM, I accidentally ran my CentOS out of disk. The OS is running happily and so are our processes, but moxi is just returning errors and the memcached process isn't responding to stats requests.

There is still free memory available, but happily we've (kinda) lived within our quota. Confusingly the quota is set to 512MByte, but the resident memory size of memcached is only 445MByte. The virtual size is larger, but it's likely not tried to allocate.

So at least this UNIX-like OS is fine when out of disk.

Matt

On 3/27/12 9:55 PM, "Frank Weigel" <XXXXXX> wrote:

In principle I agree, but if this is the only disk, UNIX doesn't do well when entirely out of disk AFAIK, so we may need to do this when the poor man's disk alert kicks in?

That's a myth. Only buggy UNIXes (or UNIX-like OSs) don't do well there. I've worked with many a UNIX that is perfectly fine with a full disk.*

I agree with Perry that it should end in TMP_OOM. We should leave ourselves some memory of course (since we need to receive the packet to respond with TMP_OOM), but there is no reason why this is not doable. It's simply a matter of writing and testing the software.

Matt

* The myth came from BSD, which way, way, way back required swap of 2x each process's memory to keep going. That "2x" is another myth that seems to keep perpetuating.

From: Perry Krug <XXX>
Date: Tue, 27 Mar 2012 02:27:01 -0700
To: Frank Weigel <XXX>
Cc: (skipped)
Subject: Re: YYYYY having issues

Can we please actually do something about this in the code so that the entire server doesn't just crash? We should start sending tmp_oom or something as soon as we detect that we are unable to write to disk.
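The behavior requested here could look roughly like the sketch below. It is not Couchbase's or memcached's actual code: `DiskBackedStore`, `Status`, and the injectable `writer` are hypothetical names for illustration. The write path catches the out-of-space error (`ENOSPC`) and answers the client with a temporary-failure status instead of going silent.

```python
import errno
import os
from enum import Enum


class Status(Enum):
    # Hypothetical status values for this sketch, not real protocol codes.
    SUCCESS = "success"
    TMP_OOM = "tmp_oom"


class DiskBackedStore:
    """Toy store that persists every mutation through a writer callable."""

    def __init__(self, log_path, writer=None):
        self.log_path = log_path
        # The writer is injectable so a full disk can be simulated.
        self._writer = writer or self._append_to_log
        self.disk_full = False

    def _append_to_log(self, record):
        with open(self.log_path, "ab") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())

    def set(self, key, value):
        record = f"{key}\t{value}\n".encode()
        try:
            self._writer(record)
        except OSError as e:
            if e.errno == errno.ENOSPC:
                # Disk is full: remember it and tell the client to back
                # off and retry, rather than failing to respond at all.
                self.disk_full = True
                return Status.TMP_OOM
            raise
        self.disk_full = False
        return Status.SUCCESS
```

A client that receives `TMP_OOM` can back off and retry, which is the same contract it already follows when the server is temporarily out of memory.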

From: Sharon Barr <xxX>
Date: Mon, 26 Mar 2012 17:11:58 -0700
To: Alex Ma <XXX>, Perry Krug <xXX>
Cc: skipped
Subject: RE: YYYYY having issues

Apparently they ran out of disk space on all nodes.

Attachments

Issue Links

duplicates

MB-8067 Couchbase should handle running out of disk space gracefully

Open

People

Assignee: Patrick Varley (Inactive)

Reporter: Steve Yen

Votes: 2

Watchers: 5

Dates

Created: 29/Mar/12 12:01 PM

Updated: 15/Apr/14 4:23 AM

Resolved: 15/Apr/14 4:23 AM

behavior when running out of disk space should be TMPOOM
