ERROR: Rebalance failed. See logs for detailed reason. You can try again.
Two occurrences of rebalance exiting with "buckets_cleanup_failed" in quick succession.
From ns_server.debug.log on 172.23.97.74:
[rebalance:error,2021-08-26T17:00:35.559-07:00,ns_1@172.23.97.74:<0.26694.1851>:ns_rebalancer:maybe_cleanup_old_buckets:941]Failed to cleanup old buckets on node 'ns_1@172.23.123.26': {badrpc,
{'EXIT',timeout}}
[ns_server:info,2021-08-26T17:00:35.561-07:00,ns_1@172.23.97.74:rebalance_agent<0.22070.0>:rebalance_agent:handle_down:290]Rebalancer process <0.26694.1851> died (reason {buckets_cleanup_failed,
['ns_1@172.23.123.26']}).
[ns_server:debug,2021-08-26T17:00:35.561-07:00,ns_1@172.23.97.74:leader_activities<0.23455.0>:leader_activities:handle_activity_down:505]Activity terminated with reason {shutdown,
[user:error,2021-08-26T17:00:35.562-07:00,ns_1@172.23.97.74:<0.23486.0>:ns_orchestrator:log_rebalance_completion:1416]Rebalance exited with reason {buckets_cleanup_failed,['ns_1@172.23.123.26']}.
Rebalance Operation Id = b06d4c675802ff97cec996a5bcad0a01
[rebalance:error,2021-08-26T17:04:19.613-07:00,ns_1@172.23.97.74:<0.11938.1854>:ns_rebalancer:maybe_cleanup_old_buckets:941]Failed to cleanup old buckets on node 'ns_1@172.23.123.33': {badrpc,
{'EXIT',timeout}}
[rebalance:error,2021-08-26T17:04:19.613-07:00,ns_1@172.23.97.74:<0.11938.1854>:ns_rebalancer:maybe_cleanup_old_buckets:941]Failed to cleanup old buckets on node 'ns_1@172.23.120.77': {badrpc,
{'EXIT',timeout}}
[rebalance:error,2021-08-26T17:04:19.613-07:00,ns_1@172.23.97.74:<0.11938.1854>:ns_rebalancer:maybe_cleanup_old_buckets:941]Failed to cleanup old buckets on node 'ns_1@172.23.120.86': {badrpc,
{'EXIT',timeout}}
[ns_server:info,2021-08-26T17:04:19.614-07:00,ns_1@172.23.97.74:rebalance_agent<0.22070.0>:rebalance_agent:handle_down:290]Rebalancer process <0.11938.1854> died (reason {buckets_cleanup_failed,
['ns_1@172.23.123.33',
'ns_1@172.23.120.77',
'ns_1@172.23.120.86']}).
[ns_server:debug,2021-08-26T17:04:19.614-07:00,ns_1@172.23.97.74:leader_activities<0.23455.0>:leader_activities:handle_activity_down:505]Activity terminated with reason {shutdown,
[user:error,2021-08-26T17:04:19.616-07:00,ns_1@172.23.97.74:<0.23486.0>:ns_orchestrator:log_rebalance_completion:1416]Rebalance exited with reason {buckets_cleanup_failed,
['ns_1@172.23.123.33','ns_1@172.23.120.77',
'ns_1@172.23.120.86']}.
Rebalance Operation Id = 7fc214ef6dfb501e6e6fa4f73923631e
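To see at a glance which nodes timed out in each rebalance attempt, the "Failed to cleanup old buckets" lines can be grouped by rebalancer pid. This is a minimal Python sketch, assuming the log lines keep exactly the format shown in the excerpt above. The name `failed_cleanup_nodes` and the regular expression are illustrative and are not part of ns_server.

```python
import re
from collections import defaultdict

# Matches the "Failed to cleanup old buckets" lines in the format shown in the
# excerpt above (an assumption; other ns_server versions may log differently).
# Captures the timestamp, the rebalancer pid, and the node that failed.
FAILED_CLEANUP = re.compile(
    r"\[rebalance:error,(?P<ts>[^,]+),[^:]+:(?P<pid><[\d.]+>):"
    r"ns_rebalancer:maybe_cleanup_old_buckets:\d+\]"
    r"Failed to cleanup old buckets on node '(?P<node>[^']+)'"
)


def failed_cleanup_nodes(log_text):
    """Return {rebalancer pid: [nodes whose bucket cleanup failed]}."""
    by_pid = defaultdict(list)
    for match in FAILED_CLEANUP.finditer(log_text):
        by_pid[match.group("pid")].append(match.group("node"))
    return dict(by_pid)
```

Calling `failed_cleanup_nodes(open("ns_server.debug.log").read())` on a log like this one would give one entry per rebalancer process, each listing the nodes that returned `{badrpc,{'EXIT',timeout}}`.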
Aliaksey Artamonau (Inactive) added a comment:
Note that this is on 7.0.1, which has neither the changes I made to chronicle to make it a bit less sensitive to disk latency, nor the changes to cbcollect_info that were meant to lower the disk pressure it creates when logs are collected. The issue occurred while cbcollect_info was running, so I'll close it as a duplicate of MB-46099.
The evidence indicating that a cluster-wide cbcollect_info collection was ongoing:
[{type,cluster_logs_collect},
{status,running},
{progress,10},
{timestamp,{{2021,8,26},{23,57,49}}},
{perNode,
[{'ns_1@172.23.106.134',
[{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.106.134.zip"},
{status,startingUpload}]},
{'ns_1@172.23.106.136',
[{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.106.136.zip"},
{status,startingUpload}]},
{'ns_1@172.23.120.58',
[{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.120.58.zip"},
{status,startedUpload},
{url,
"https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1630022267/collectinfo-2021-08-26T235749-ns_1%40172.23.120.58.zip"}]},
{'ns_1@172.23.120.73',
[{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.120.73.zip"},
{status,startingUpload}]},
{'ns_1@172.23.120.74',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.120.74.zip"}]},
{'ns_1@172.23.120.77',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.120.77.zip"}]},
{'ns_1@172.23.120.81',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.120.81.zip"}]},
{'ns_1@172.23.120.86',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.120.86.zip"}]},
{'ns_1@172.23.123.24',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.123.24.zip"}]},
{'ns_1@172.23.123.25',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.123.25.zip"}]},
{'ns_1@172.23.123.26',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.123.26.zip"}]},
{'ns_1@172.23.123.31',
[{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.123.31.zip"},
{status,uploaded},
{url,
"https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1630022267/collectinfo-2021-08-26T235749-ns_1%40172.23.123.31.zip"}]},
{'ns_1@172.23.123.32',
[{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.123.32.zip"},
{status,uploaded},
{url,
"https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1630022267/collectinfo-2021-08-26T235749-ns_1%40172.23.123.32.zip"}]},
{'ns_1@172.23.123.33',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.123.33.zip"}]},
{'ns_1@172.23.96.122',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.96.122.zip"}]},
{'ns_1@172.23.96.14',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.96.14.zip"}]},
{'ns_1@172.23.96.243',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.96.243.zip"}]},
{'ns_1@172.23.96.254',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.96.254.zip"}]},
{'ns_1@172.23.96.48',
[{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.96.48.zip"},
{status,uploaded},
{url,
"https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1630022267/collectinfo-2021-08-26T235749-ns_1%40172.23.96.48.zip"}]},
{'ns_1@172.23.97.105',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.105.zip"}]},
{'ns_1@172.23.97.110',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.110.zip"}]},
{'ns_1@172.23.97.112',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.112.zip"}]},
{'ns_1@172.23.97.148',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.148.zip"}]},
{'ns_1@172.23.97.149',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.149.zip"}]},
{'ns_1@172.23.97.150',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.150.zip"}]},
{'ns_1@172.23.97.151',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.151.zip"}]},
{'ns_1@172.23.97.241',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.241.zip"}]},
{'ns_1@172.23.97.74',
[{status,started},
{path,
"/opt/couchbase/var/lib/couchbase/tmp/collectinfo-2021-08-26T235749-ns_1@172.23.97.74.zip"}]}]}]
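In the term above, the four nodes named in the rebalance failures (172.23.123.26, 172.23.123.33, 172.23.120.77 and 172.23.120.86) all show `{status,started}`, meaning collection was still running on them. The per-node statuses can be summarized with a short script. This is a rough Python sketch, assuming the term has been saved as text in the layout shown here. It uses regular expressions rather than a real Erlang term parser, and the names `per_node_status` and `status_counts` are hypothetical.

```python
import re
from collections import Counter

# One per-node entry: {'ns_1@host', [ ...properties... ]}
# Assumes no "]}" appears inside the property list, which holds for the
# term shown above but is not guaranteed in general.
NODE_ENTRY = re.compile(r"\{'(?P<node>[^']+)',\s*\[(?P<body>.*?)\]\}", re.DOTALL)
STATUS = re.compile(r"\{status,(?P<status>\w+)\}")


def per_node_status(term_text):
    """Map each node name to its collection status (started, uploaded, ...)."""
    statuses = {}
    for entry in NODE_ENTRY.finditer(term_text):
        found = STATUS.search(entry.group("body"))
        if found:
            statuses[entry.group("node")] = found.group("status")
    return statuses


def status_counts(term_text):
    """Count how many nodes are in each collection status."""
    return Counter(per_node_status(term_text).values())
```

The top-level `{status,running}` is not counted, because only statuses inside a per-node entry are matched.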