Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: 4.1.0
Affects Version/s: 4.0.0
Component/s: couchbase-bucket
Security Level: Public
Labels: performance
Environment: Sherlock RC4 4.0.0-4047. This symptom *probably* existed before Sherlock RC1; we only just got to the bottom of triaging this.
Triage: Untriaged
Operating System: Centos 64-bit
Is this a Regression?: Unknown
Sprint: KV: Sep 14 - Oct 2

Description

The test first loads 100M documents and performs a graceful failover, which completes without problems.

The test then adds the node (.14) back and starts a rebalance, which does not complete.

(If I then manually trigger the rebalance again, it succeeds.)

(The test also passes when run with 10M documents in total.)

(The 100M case is very reproducible on Ares.)

A REST call to pools/default/tasks returned this:

{u'status': u'notRunning', u'statusIsStale': False, u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.', u'type': u'rebalance', u'masterRequestTimedOut': False}
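A test harness could detect this condition from the tasks response rather than waiting for a timeout. The following is a minimal sketch, assuming the response body has already been decoded into a list of task dictionaries; the function name `rebalance_failure` and the `sample_tasks` variable are hypothetical and not part of any Couchbase client library.

```python
def rebalance_failure(tasks):
    """Return the error message of a failed rebalance task, or None.

    `tasks` is assumed to be the decoded pools/default/tasks response:
    a list of dictionaries, one per task.
    """
    for task in tasks:
        if task.get("type") != "rebalance":
            continue
        # A rebalance that is not running and carries an error message
        # is treated as failed, matching the entry seen in this report.
        if task.get("status") == "notRunning" and task.get("errorMessage"):
            return task["errorMessage"]
    return None


# Task entry based on the response quoted in this report.
sample_tasks = [{
    "status": "notRunning",
    "statusIsStale": False,
    "errorMessage": "Rebalance failed. See logs for detailed reason. "
                    "You can try rebalance again.",
    "type": "rebalance",
    "masterRequestTimedOut": False,
}]

print(rebalance_failure(sample_tasks))
```

The function returns None both when no rebalance task is listed and when the rebalance is still running, so a caller would need to keep polling in the second case.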

Here is a log snippet from the console:

Failed to wait deletion of some buckets on some nodes: [{'ns_1@172.23.96.14',
{'EXIT',
{old_buckets_shutdown_wait_failed, ["bucket-1"]}}}]

Here is something possibly relevant in the ns_server.debug.log on the .14 node:

[ns_server:error,2015-08-25T00:47:09.552-07:00,ns_1@172.23.96.14:timeout_diag_logger<0.129.0>:timeout_diag_logger:do_diag:105]Got timeout {slow_bucket_stop,{{single_bucket_kv_sup,"bucket-1"},
<0.369.0>,supervisor,
[single_bucket_kv_sup]}}
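When triaging the attached logs, it can help to search for both failure signatures at once. The sketch below is an illustration only: the two regular expressions are derived from the snippets in this report, and the helper name `stuck_buckets` is hypothetical. It assumes the log text is available as a single string.

```python
import re

# Signatures taken from the two log snippets in this report.
SHUTDOWN_WAIT_FAILED = re.compile(
    r'old_buckets_shutdown_wait_failed,\s*\[([^\]]*)\]')
SLOW_BUCKET_STOP = re.compile(
    r'slow_bucket_stop,\s*\{\{single_bucket_kv_sup,\s*"([^"]+)"\}')


def stuck_buckets(log_text):
    """Return sorted bucket names mentioned in either failure signature."""
    names = set()
    for match in SHUTDOWN_WAIT_FAILED.finditer(log_text):
        # The captured group is the Erlang list body, e.g. '"bucket-1"'.
        names.update(re.findall(r'"([^"]+)"', match.group(1)))
    names.update(SLOW_BUCKET_STOP.findall(log_text))
    return sorted(names)


sample = '''Failed to wait deletion of some buckets on some nodes: [{'ns_1@172.23.96.14',
{'EXIT',
{old_buckets_shutdown_wait_failed, ["bucket-1"]}}}]
Got timeout {slow_bucket_stop,{{single_bucket_kv_sup,"bucket-1"},
<0.369.0>,supervisor,
[single_bucket_kv_sup]}}'''

print(stuck_buckets(sample))  # prints ['bucket-1']
```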

Attachments

- 172.23.96.14.zip (6.25 MB, 25/Aug/15 11:50 AM)
- 172.23.96.13.zip (6.63 MB, 25/Aug/15 11:50 AM)
- 172.23.96.12.zip (6.66 MB, 25/Aug/15 11:50 AM)
- 172.23.96.11.zip (6.90 MB, 25/Aug/15 11:50 AM)

Issue Links

duplicates

MB-15374 [system test] Hard Fail Over -> add back with Full Recovery: Rebalance exited with reason {buckets_shutdown_wait_failed, {old_buckets_shutdown_wait_failed,

Closed

People

Assignee: David Kao (Inactive)

Reporter: David Kao (Inactive)

Votes: 0

Watchers: 5

Dates

Created: 25/Aug/15 11:50 AM

Updated: 05/Dec/15 5:07 PM

Resolved: 25/Aug/15 3:09 PM

Gerrit Reviews

There are no open Gerrit changes.

There are 2 closed Gerrit changes:

MB-16155: [BP] MB-15374 Cancel all tasks if force flag set during destroy: Gerrit Review:

Merge remote-tracking branch 'couchbase/sherlock': Gerrit Review:

rebalance-in fails to wait for bucket deletion after graceful failover.
