Details
Bug
Resolution: Unresolved
Critical
7.6.0
7.6.0-1970
Untriaged
0
Unknown
Description
Three days into the run, there have been several rebalance failures. The ones of interest are listed below.
Rebalance 1
[user:error,2024-01-04T05:12:06.433-08:00,ns_1@172.23.97.67:<0.11390.441>:ns_orchestrator:log_rebalance_completion:1661]Rebalance exited with reason {service_rebalance_failed,index,
  {worker_died,
    {'EXIT',<0.11005.1009>,
      {task_failed,rebalance,
        {service_error,
          <<"RestoreShard error :alternateId(13867861711156681046-3-1) already exists (shardId6469598071031833410)">>}}}}}.
Rebalance 2
[user:error,2024-01-04T19:47:03.584-08:00,ns_1@172.23.97.67:<0.3749.1088>:ns_orchestrator:log_rebalance_completion:1661]Rebalance exited with reason {service_rebalance_failed,index,
  {{badmatch,
    {error,
      {bad_nodes,index,get_agent,
        [{'ns_1@172.23.97.108',
          {exit,
            {{timeout,
              {gen_server,call,
                [<34830.23512.1801>,
                 {call,"ServiceAPI.CancelTask",
                  #Fun<json_rpc_connection.0.36915653>,
                  #{timeout => 60000}},
                 60000]}},
             {gen_server,call,
               [{'service_agent-index',
                 'ns_1@172.23.97.108'},
                get_agent,infinity]}}}},
         {'ns_1@172.23.97.109',
          {exit,
            {{timeout,
              {gen_server,call,
                [<34831.29071.1566>,
                 {call,"ServiceAPI.CancelTask",
                  #Fun<json_rpc_connection.0.36915653>,
                  #{timeout => 60000}},
                 60000]}},
             {gen_server,call,
               [{'service_agent-index',
                 'ns_1@172.23.97.109'},
                get_agent,infinity]}}}}]}}},
   [{service_manager,wait_for_agents,1,
     [{file,"src/service_manager.erl"},
      {line,165}]},
    {service_manager,run_op,1,
     [{file,"src/service_manager.erl"},
      {line,140}]},
    {proc_lib,init_p,3,
     [{file,"proc_lib.erl"},{line,225}]}]}}.
Rebalance Operation Id = ed0472d71a0728851d079456da8ec8dd
Rebalance 3
[user:error,2024-01-04T20:22:48.243-08:00,ns_1@172.23.97.67:<0.3749.1088>:ns_orchestrator:log_rebalance_completion:1661]Rebalance exited with reason {service_rebalance_failed,index,
  {worker_died,
    {'EXIT',<0.11459.1536>,
      {task_failed,rebalance,
        {service_error,
          <<"Collection does not exist or temporarily unavailable for creating new index.Bucket = bucket5 Scope = _default Collection = GNhuSkW2Zi. Please retry the operation at a later time.">>}}}}}.
Rebalance Operation Id = ba77566605cdbf2830151efca3113705
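For triage, the three entries above share the same header layout: `[category:level,timestamp,node:<pid>:module:function:line]` followed by the message. The following is a minimal Python sketch, not part of any Couchbase tooling, that splits such an entry into its parts and pulls out the `service_error` text when one is present. The field names (`category`, `level`, and so on) and the helper name `parse_entry` are labels chosen here for illustration, and the regular expressions are assumptions inferred only from the three entries in this ticket.

```python
import re
from typing import Optional

# Header layout inferred from the entries in this ticket (an assumption):
# [category:level,timestamp,node:<pid>:module:function:line]message
HEADER_RE = re.compile(
    r"^\[(?P<category>\w+):(?P<level>\w+),"
    r"(?P<timestamp>[^,]+),"
    r"(?P<node>[^:]+):"
    r"(?P<pid><[\d.]+>):"
    r"(?P<module>\w+):(?P<function>\w+):(?P<line>\d+)\]"
    r"(?P<message>.*)$",
    re.DOTALL,
)

# The service error is carried as an Erlang binary: {service_error, <<"...">>}
SERVICE_ERROR_RE = re.compile(r'\{service_error,\s*<<"(.*?)">>', re.DOTALL)


def parse_entry(text: str) -> Optional[dict]:
    """Split one log entry into header fields; return None if it does not match."""
    match = HEADER_RE.match(text.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["line"] = int(entry["line"])
    err = SERVICE_ERROR_RE.search(entry["message"])
    # Entries without a service_error term (such as Rebalance 2) yield None here.
    entry["service_error"] = err.group(1) if err else None
    return entry


# The Rebalance 1 entry from this ticket, joined onto a single line.
SAMPLE = (
    "[user:error,2024-01-04T05:12:06.433-08:00,ns_1@172.23.97.67:<0.11390.441>:"
    "ns_orchestrator:log_rebalance_completion:1661]Rebalance exited with reason "
    "{service_rebalance_failed,index,{worker_died,{'EXIT',<0.11005.1009>,"
    "{task_failed,rebalance,{service_error,"
    '<<"RestoreShard error :alternateId(13867861711156681046-3-1) already exists '
    '(shardId6469598071031833410)">>}}}}}.'
)

if __name__ == "__main__":
    entry = parse_entry(SAMPLE)
    print(entry["timestamp"], entry["node"], entry["service_error"])
```

Running the same helper over Rebalance 2 would return the header fields with `service_error` set to `None`, since that failure is a `badmatch` on a `CancelTask` timeout rather than an error reported by the index service.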
Latest cbcollect logs:
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.105.122.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.106.171.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.106.176.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.106.30.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.96.198.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.96.230.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.96.245.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.97.100.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.97.109.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.97.66.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1704438199/collectinfo-2024-01-05T072606-ns_1%40172.23.97.67.zip
Issue Links
- relates to: MB-60176 De-couple dropIndex and stream request lock (Open)
Gerrit Reviews
For Gerrit Dashboard: MB-60282

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
203363,1 | MB-60282 Return scope/collection not found error incase of mismatch in UID | unstable | indexing | MERGED | +2 | +1 |