Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: 7.6.0
Affects Version/s: 7.6.0
Component/s: ns_server
Labels:
Environment:
OS - Debian 10
Initial version - 7.2.0-5325
Upgrade version - 7.6.0-1666

Triage:
Untriaged
Operating System:
Linux x86_64
Link to Log File, atop/blg, CBCollectInfo, Core dump:

https://cb-jira.s3.us-east-2.amazonaws.com/logs/reb_500_resp/collectinfo-2023-10-19T055433-ns_1%40172.23.106.201.zip
https://cb-jira.s3.us-east-2.amazonaws.com/logs/reb_500_resp/collectinfo-2023-10-19T055433-ns_1%40172.23.106.202.zip
https://cb-jira.s3.us-east-2.amazonaws.com/logs/reb_500_resp/collectinfo-2023-10-19T055433-ns_1%40172.23.106.203.zip

Story Points:
0
Is this a Regression?:
Yes

Description

Steps:

Install 7.2.0-5325 on 2 nodes and initialise the cluster with those 2 nodes running just the KV service. The nodes used in this test have 8 CPU cores each.
Create 30 Magma buckets of 256MB RAM Quota each.
Load 1000 documents of 1024 bytes each into all the buckets.
Install 7.6.0-1666 with a provisioned profile on a spare node that is not part of the cluster.
Perform a swap rebalance, replacing one node in the cluster with the spare node.
The swap rebalance fails with the reason: error: 500 reason: unknown ["Unexpected server error, request logged."]
Cb-collect logs were collected, and the following error was observed in http_access.log:

[ns_server:error,2023-10-18T22:13:14.943-07:00,ns_1@172.23.106.201:<0.20565.9>:menelaus_util:reply_server_error_before_close:211]Server error during processing: ["web request failed",
                                 {path,"/controller/rebalance"},
                                 {method,'POST'},
                                 {type,error},
                                 {what,
                                  {case_clause,
                                   {not_enough_cores_for_num_buckets,
                                    <<"The following node(s) being added have insufficient cpu cores for the number of buckets already in the cluster: ns_1@172.23.106.203">>}}},
                                 {trace,
                                  [{menelaus_web_cluster,do_handle_rebalance,4,
                                    [{file,"src/menelaus_web_cluster.erl"},
                                     {line,962}]},
                                   {request_tracker,request,2,
                                    [{file,"src/request_tracker.erl"},
                                     {line,40}]},

The bucket guardrail enforced in 7.6.0 requires 0.4 CPU cores per bucket, so the maximum number of buckets allowed on an 8-core node is 8 / 0.4 = 20. The cluster in this test has 30 buckets, which is 10 over that limit.
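The arithmetic above can be checked with a short sketch. This is only an illustration of the calculation quoted in this ticket, not ns_server code; the function name max_buckets_for_cores and the constant name are hypothetical, and the 0.4 cores-per-bucket figure is taken from the description above.

```python
from fractions import Fraction
import math

# Assumption: 0.4 cores per bucket, as quoted in this ticket.
# Fraction avoids floating-point rounding when dividing by 0.4.
MIN_CORES_PER_BUCKET = Fraction("0.4")


def max_buckets_for_cores(cores: int) -> int:
    """Largest bucket count that still leaves 0.4 cores per bucket."""
    return math.floor(Fraction(cores) / MIN_CORES_PER_BUCKET)


if __name__ == "__main__":
    limit = max_buckets_for_cores(8)
    print(f"limit for 8 cores: {limit}")
    print(f"30 buckets exceeds the limit: {30 > limit}")
```

With the 8-core nodes and 30 buckets used in this test, the bucket count is above the computed limit, which matches the not_enough_cores_for_num_buckets error in the log.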

The cluster already exceeds this limit on 7.2.0, but because the upgrade to 7.6.0 is performed as a swap rebalance between identical nodes, the rebalance should not fail. Guardrails should only be enforced for buckets created after the upgrade completes.
This is also a regression introduced after 7.6.0-1640: the same test passes when upgrading to 7.6.0-1640 but fails when upgrading to 7.6.0-1666.
Cb-collect logs have been attached.
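The expected behaviour can be sketched as a check that is skipped entirely when the guardrail is not enabled. This is a hypothetical illustration only: the function rebalance_core_check, its parameters, and the message text are assumptions made for this example and do not reflect the actual ns_server implementation.

```python
from fractions import Fraction
import math
from typing import Optional


def rebalance_core_check(
    enabled: bool,
    node_cores: int,
    num_buckets: int,
    min_cores_per_bucket: Fraction = Fraction("0.4"),  # figure quoted in this ticket
) -> Optional[str]:
    """Return an error message if the node fails the check, else None."""
    if not enabled:
        # Guardrail disabled: never block the rebalance on core count.
        return None
    limit = math.floor(Fraction(node_cores) / min_cores_per_bucket)
    if num_buckets > limit:
        return (
            f"insufficient cpu cores: {num_buckets} buckets exceed "
            f"the limit of {limit} for {node_cores} cores"
        )
    return None


if __name__ == "__main__":
    # Scenario from this ticket: 8 cores, 30 existing buckets.
    print(rebalance_core_check(False, 8, 30))
    print(rebalance_core_check(True, 8, 30))
```

Returning a value the caller handles explicitly, rather than letting an unmatched result propagate, is also what would turn the generic 500 response into a meaningful error for the user.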

People

Assignee: Vibhav S P

Reporter: Vibhav S P

Votes: 0

Watchers: 5

Dates

Created: 19/Oct/23 9:27 AM

Updated: 30/Oct/23 9:04 AM

Resolved: 27/Oct/23 2:28 AM

Gerrit Reviews

There are no open Gerrit changes. There is 1 closed Gerrit change:

MB-59216: Only check rebalance guardrails when enabled: Gerrit Review:

[Upgrade] Swap Rebalance exits with the reason 'error: 500 reason: unknown ["Unexpected server error, request logged."]' when upgrading from 7.2.0 -> 7.6.0 with 30 buckets in the cluster
