Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: 5.1.0
Affects Version/s: 4.6.0, 4.6.1, 4.6.4, 4.6.2, 4.6.3, 5.0.0
Component/s: couchbase-bucket, ns_server
Labels:

Triage:
Untriaged
Flagged:

Release Note
Link to Log File, atop/blg, CBCollectInfo, Core dump:

https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.105.60.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.105.61.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.105.62.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.105.63.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.105.83.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.106.14.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.106.213.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.106.96.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.99.168.zip
https://s3.amazonaws.com/bugdb/jira/sep13/collectinfo-2017-09-13T231925-ns_1%40172.23.99.253.zip

Is this a Regression?:
Yes

Description

The issue occurred 5 days into a longevity test with ephemeral buckets that have no eviction policy.

The logs show the rebalance starting, followed by several metadata overhead warnings and then an ns_server backtrace:

2017-09-13T07:34:51.604-07:00, ns_orchestrator:4:info:message(ns_1@172.23.106.14) - Starting rebalance, KeepNodes = ['ns_1@172.23.105.60','ns_1@172.23.105.61',
                                 'ns_1@172.23.105.62','ns_1@172.23.105.63',
                                 'ns_1@172.23.106.14','ns_1@172.23.106.213',
                                 'ns_1@172.23.106.96','ns_1@172.23.99.168',
                                 'ns_1@172.23.99.253'], EjectNodes = ['ns_1@172.23.105.83'], Failed over and being ejected nodes = []; no delta recovery nodes

2017-09-13T07:40:32.197-07:00, ns_vbucket_mover:0:info:message(ns_1@172.23.106.14) - Bucket "default" rebalance appears to be swap rebalance

2017-09-13T08:02:01.695-07:00, menelaus_web_alerts_srv:0:info:message(ns_1@172.23.99.253) - Metadata overhead warning. Over  50% of RAM allocated to bucket  "default" on node "172.23.99.253" is taken up by keys and metadata.

2017-09-13T08:02:22.551-07:00, menelaus_web_alerts_srv:0:info:message(ns_1@172.23.99.253) - Metadata overhead warning. Over  50% of RAM allocated to bucket  "default" on node "172.23.99.253" is taken up by keys and metadata. (repeated 6 times)

per_node_processes('ns_1@172.23.106.14') =
     {<0.32569.4081>,
      [{registered_name,[]},
       {status,waiting},
       {initial_call,{proc_lib,init_p,5}},
       {backtrace,
           [<<"Program counter: 0x00007f460af7b288 (ns_single_vbucket_mover:spawn_and_wait/1 + 72)">>,
            <<"CP: 0x0000000000000000 (invalid)">>,<<"arity = 0">>,<<>>,
            <<"0x00007f4609bdd678 Return addr 0x00007f46533eee90 (misc:try_with_maybe_ignorant_after/2 + 80)">>,
            <<"y(0)     []">>,<<"y(1)     []">>,<<"y(2)     <0.20357.4080>">>,
            <<>>,
            <<"0x00007f4609bdd698 Return addr 0x00007f460af7b0d8 (ns_single_vbucket_mover:mover/5 + 896)">>,
            <<"y(0)     []">>,<<"y(1)     []">>,<<"y(2)     []">>,
            <<"y(3)     []">>,
            <<"y(4)     #Fun<ns_single_vbucket_mover.3.48828051>">>,
            <<"y(5)     Catch 0x00007f46533eeeb0 (misc:try_with_maybe_ignorant_after/2 + 112)">>,
            <<>>,
            <<"0x00007f4609bdd6d0 Return addr 0x00007f465befc198 (proc_lib:init_p_do_apply/3 + 56)">>,
            <<"y(0)     []">>,<<"y(1)     true">>,
            <<"y(2)     ['ns_1@172.23.105.62','ns_1@172.23.106.213']">>,
            <<"y(3)     ['ns_1@172.23.105.62','ns_1@172.23.105.83']">>,
            <<"y(4)     27">>,<<"y(5)     <0.25037.4080>">>,<<>>,
            <<"0x00007f4609bdd708 Return addr 0x0000000000893588 (<terminate process normally>)">>,
            <<"y(0)     Catch 0x00007f465befc1b8 (proc_lib:init_p_do_apply/3 + 88)">>,
            <<>>]},

As a result, the rebalance hangs in the cluster.
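To look for the same sequence (rebalance start followed by metadata overhead warnings) in other collected logs, the following is a minimal sketch. It assumes the log lines follow the `<timestamp>, <module>:<code>:<level>:message(<node>) - <text>` layout seen in the excerpts quoted in this ticket; that layout is inferred only from those excerpts. The names `LINE_RE`, `parse_line`, and `summarize` are hypothetical helpers and not part of any Couchbase tool.

```python
import re

# Layout inferred from the log excerpts in this ticket (an assumption):
# "<timestamp>, <module>:<code>:<level>:message(<node>) - <text>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+), (?P<module>\w+):(?P<code>\d+):(?P<level>\w+):"
    r"message\((?P<node>[^)]+)\) - (?P<text>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def summarize(lines):
    """Collect rebalance starts and metadata overhead warnings, in input order."""
    events = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            # Continuation lines (e.g. a wrapped node list) are skipped.
            continue
        if rec["text"].startswith("Starting rebalance"):
            events.append((rec["ts"], rec["node"], "rebalance_start"))
        elif rec["text"].startswith("Metadata overhead warning"):
            events.append((rec["ts"], rec["node"], "metadata_overhead"))
    return events
```

Calling `summarize(open(path))` on a log file (where `path` is whichever file holds these user-visible messages) would give an ordered list of `(timestamp, node, event)` tuples that can be compared against the timeline above.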

Attachments

Issue Links

is caused by

MB-21568 rollback may leave hashtable inconsistent with on-disk data

Closed

relates to

MB-16277 ns_server should set vbucket on future master to pending state during rebalance

Resolved

MB-29381 [BP 4.6.5] vbucket mover crashed if pending vBucket requires rollback

Closed

Activity

People

Assignee: Dave Rigby (Inactive)

Reporter: Tommie McAfee (Inactive)

Votes: 0

Watchers: 15

Dates

Created: 13/Sep/17 5:14 PM

Updated: 30/Mar/21 8:55 AM

Resolved: 23/Apr/18 6:44 AM

Gerrit Reviews

There are no open Gerrit changes

There are 3 closed Gerrit changes

MB-26037: Allow DCP rollback on vbuckets in pending state: Gerrit Review:

Merge remote-tracking branch 'couchbase/spock': Gerrit Review:

MB-29381: Allow DCP rollback on vbuckets in pending state: Gerrit Review:
