Details
Bug
Resolution: Fixed
Critical
7.6.0
7.6.0-2167
Untriaged
0
Yes
Description
There have been two rebalance failures with a similar cause:
Failure 1:
[user:error,2024-02-27T10:19:34.598-08:00,ns_1@172.23.97.67:<0.22535.331>:ns_orchestrator:log_rebalance_completion:1661]Rebalance exited with reason {service_rebalance_failed,index,
    {worker_died,
        {'EXIT',<0.11571.755>,
            {task_failed,rebalance,
                {service_error,
                    <<"RestoreShard error :shard already exists :/data/@2i/shards/shard14695280024876267862">>}}}}}.
Failure 2:
[user:error,2024-02-27T11:10:45.371-08:00,ns_1@172.23.97.67:<0.22535.331>:ns_orchestrator:log_rebalance_completion:1661]Rebalance exited with reason {service_rebalance_failed,index,
    {worker_died,
        {'EXIT',<0.32326.774>,
            {task_failed,rebalance,
                {service_error,
                    <<"RestoreShard error :shard already exists :/data/@2i/shards/
Panics were observed on nodes 172.23.97.108 and 172.23.106.176.
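To find every occurrence of this error across the collected logs, something like the following sketch may help. It assumes only the error wording shown in the two failures above; the function name `shard_paths` and the regular expression are illustrative and are not part of any Couchbase tool. The shard name is allowed to be absent because a log line can be truncated.

```python
import re

# Matches the error text from the rebalance failures in this report.
# The path stops at whitespace, a quote, an angle bracket, or a pipe,
# so a truncated line yields only the directory prefix.
SHARD_EXISTS = re.compile(
    r'RestoreShard error :shard already exists :(?P<path>/[^\s"<>|]*)'
)


def shard_paths(log_text):
    """Return the shard paths reported as already existing, in order."""
    return [m.group("path") for m in SHARD_EXISTS.finditer(log_text)]


sample = (
    '<<"RestoreShard error :shard already exists '
    ':/data/@2i/shards/shard14695280024876267862">>}}}}}.'
)
print(shard_paths(sample))
# prints ['/data/@2i/shards/shard14695280024876267862']
```

Running this over the logs from each node would show whether both failures name the same shard or different ones.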
cbcollect ->
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.106.176.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.106.30.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.96.198.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.96.230.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.96.245.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.97.100.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.97.108.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.97.109.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.97.66.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.97.67.zip
cbcollect n-1 ->
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.106.176.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.106.30.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.96.198.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.96.230.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.96.245.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.97.100.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.97.66.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.97.67.zip
cbcollect n-2 ->
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.106.171.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.106.176.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.106.30.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.96.198.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.96.230.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.96.245.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.97.100.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.97.108.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.97.66.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709052981/collectinfo-2024-02-27T170905-ns_1%40172.23.97.67.zip
cbcollect n-3 ->
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.105.122.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.106.171.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.106.176.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.106.30.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.96.198.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.96.230.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.96.245.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.97.100.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.97.109.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.97.66.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1709048549/collectinfo-2024-02-27T155622-ns_1%40172.23.97.67.zip
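The set of nodes differs between the cbcollect runs listed above, so it can be useful to compare them programmatically. The sketch below is one hedged way to do that: it assumes the archive names always end in `ns_1%40<IP>.zip`, as they do in the URLs above, and the helper name `nodes_in` is hypothetical.

```python
import re
from urllib.parse import unquote

# After URL-decoding, archive names end in "ns_1@<IPv4 address>.zip".
NODE = re.compile(r"ns_1@(\d+\.\d+\.\d+\.\d+)\.zip$")


def nodes_in(urls):
    """Return the set of node IPs named in cbcollect archive URLs."""
    found = set()
    for url in urls:
        match = NODE.search(unquote(url.strip()))
        if match:
            found.add(match.group(1))
    return found


base = "https://cb-jira.s3.us-east-2.amazonaws.com/logs/"
current = nodes_in([
    base + "systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.97.108.zip",
    base + "systestmon-1709061302/collectinfo-2024-02-27T192429-ns_1%40172.23.97.67.zip",
])
previous = nodes_in([
    base + "systestmon-1709057345/collectinfo-2024-02-27T181502-ns_1%40172.23.97.67.zip",
])
print(sorted(current - previous))
# prints ['172.23.97.108']
```

Feeding in the full URL lists for each run would show which nodes joined or left the cluster between collections.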
After a discussion with Varun Velamuri, we concluded that this does not look like https://issues.couchbase.com/browse/MB-60917.
We did not see this failure in RC1 through RC6; we are seeing it for the first time in build 2167.
I'll let Varun comment after the RCA on whether this is a regression, but since we have not seen this issue before, QE has marked it as a regression.
cc Ritam Sharma