Details
Bug
Resolution: Duplicate
Critical
7.6.0
Untriaged
0
No
Description
The following test was performed on Capella.
AWS cluster with ami -
CSP - GCP
ami - couchbase-cloud-server-7-6-0-2135-v1-0-28
Without any apparent trigger, all 3 nodes of the cluster were failed over and then added back to the cluster a short while later.
The following errors were observed multiple times in the logs:
----------------------------------------------------------------------------------------------------------------------------------------------------------
Service 'ns_server' exited with status 137. Restarting. Messages: working as port 1946: Booted. Waiting for shutdown request 1946: Booted. Waiting for shutdown request working as port [os_mon] cpu supervisor port (cpu_sup): Erlang has closed [os_mon] memory supervisor port (memsup): Erlang has closed
----------------------------------------------------------------------------------------------------------------------------------------------------------
Compactor for database `sift_bucket` (pid [{type,database}, {important,true}, {name,<<"sift_bucket">>}, {fa, {#Fun<compaction_daemon.4.76759806>, [<<"sift_bucket">>, {config,{30,undefined}, {30,undefined}, undefined,false,false, {daemon_config,30,131072, 20971520}}, false, {[{type,bucket}]}]}}]) terminated unexpectedly: {compromised_reply, {error, timeout, [{ns_memcached, worker_loop, 3, [{file, "src/ns_memcached.erl"}, {line, 253}]}, {proc_lib, init_p_do_apply, 3, [{file, "proc_lib.erl"}, {line, 240}]}]}, {gen_server, call, [{'ns_memcached-sift_bucket', 'ns_1@svc-dqisea-node-002.x3w1mh9c1wvwukx.sandbox.nonprod-project-avengers.com'}, {raw_stats, <<"diskinfo">>, undefined, #Fun<compaction_daemon.18.76759806>, {<<"0">>, <<"0">>}}, 300000]}}
----------------------------------------------------------------------------------------------------------------------------------------------------------
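A note on the first error: an exit status above 128 conventionally encodes 128 plus a signal number, so status 137 would correspond to signal 9 (SIGKILL). The sketch below decodes a status under that convention. The helper name `describe_exit_status` is hypothetical, and the decoding does not by itself show what sent the signal (the kernel OOM killer is a common sender, but that would need to be confirmed from the system logs in the collectinfo bundles).

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the 128 + N signal convention.

    Hypothetical helper for illustration; not part of ns_server.
    """
    if status > 128:
        signum = status - 128
        try:
            # Resolve the signal name on the local platform (e.g. SIGKILL).
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by {name} (signal {signum})"
    return f"exited with code {status}"


# Status reported for 'ns_server' in the log message above.
print(describe_exit_status(137))
```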
server logs -
https://cb-engineering.s3.amazonaws.com/aman/collectinfo-2024-02-16T054818-ns_1%40svc-dqisea-node-001.x3w1mh9c1wvwukx.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/aman/collectinfo-2024-02-16T054818-ns_1%40svc-dqisea-node-002.x3w1mh9c1wvwukx.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/aman/collectinfo-2024-02-16T054818-ns_1%40svc-dqisea-node-003.x3w1mh9c1wvwukx.sandbox.nonprod-project-avengers.com.zip
This sudden failover was observed around 8:19 PM PST.
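To line the reported time up with log timestamps, it may help to convert it to UTC. The sketch below does this with a fixed UTC-8 offset for PST. The calendar date is an assumption: the ticket gives only the time of day, and 2024-02-15 is inferred from the `2024-02-16T054818` stamp in the collectinfo filenames. Whether the log timestamps themselves are in UTC is also an assumption to verify.

```python
from datetime import datetime, timedelta, timezone

# PST is a fixed offset of UTC-8 (no daylight saving applies in February).
PST = timezone(timedelta(hours=-8), "PST")

# ASSUMPTION: the date is not stated in the ticket; it is inferred from the
# collectinfo filenames, which are stamped 2024-02-16T054818.
ASSUMED_DATE = (2024, 2, 15)


def pst_to_utc(year: int, month: int, day: int, hour: int, minute: int) -> datetime:
    """Convert a PST wall-clock time to an aware UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=PST)
    return local.astimezone(timezone.utc)


# 8:19 PM PST, as reported in the ticket.
failover_utc = pst_to_utc(*ASSUMED_DATE, 20, 19)
print(failover_utc.isoformat())
```

Under these assumptions the failover falls at 04:19 UTC on 2024-02-16, which is before the time stamped in the collectinfo filenames.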