Details
Bug
Resolution: Cannot Reproduce
Critical
7.2.1
Untriaged
0
Unknown
Description
Version: couchbase-cloud-server-7.2.1-5819-v1.0.20
Test Scenario: Test code
Steps involved in the test:
- Create 15 scopes and 60 collections across 3 buckets
- Each bucket holds 10M data items, with 1 index on each collection (60 indexes in total)
- Sleep for 15 mins
- Run FTS queries and FTS Flex queries in random fashion for 2 hrs
- Kill CBFT
- Sleep for 15 mins
- Scale out to 5 nodes
- Create 10 more FTS indexes with the following config:
  - 1 index | 1 replica | 5 partitions
  - 3 indexes | 2 replicas | 6 partitions
  - 6 indexes | 0 replicas | 4 partitions
- Sleep for 15 mins
- Run FTS queries and FTS Flex queries in random fashion for 30 mins again
- Kill CBFT
- Sleep for 15 mins
- Rebalance/scale in to 4 nodes
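To get a rough sense of how much the 10 additional indexes add to the rebalance, the configuration above can be turned into a partition count. This is only a back-of-the-envelope sketch: it assumes each replica holds a full copy of every partition and that copies are spread evenly across nodes, and it ignores the original 60 indexes. The names NEW_INDEXES, total_indexes and partition_copies are made up for illustration.

```python
# (index count, replicas per index, partitions per index), taken from the steps above.
NEW_INDEXES = [
    (1, 1, 5),
    (3, 2, 6),
    (6, 0, 4),
]


def total_indexes(configs):
    """Number of indexes described by the configuration list."""
    return sum(count for count, _replicas, _partitions in configs)


def partition_copies(configs):
    """Active plus replica partition copies, assuming one full copy per replica."""
    return sum(
        count * (replicas + 1) * partitions
        for count, replicas, partitions in configs
    )


if __name__ == "__main__":
    copies = partition_copies(NEW_INDEXES)
    print("indexes:", total_indexes(NEW_INDEXES))
    print("partition copies:", copies)
    # Even-spread estimate before and after the scale-in step.
    for nodes in (5, 4):
        print(f"{nodes} nodes: about {copies / nodes:.1f} copies per node")
```

Under these assumptions the scale-in from 5 to 4 nodes has to redistribute the share of partition copies held by the departing node across the remaining four.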
During this step, the rebalance exits for the FTS service with the following error:
Rebalance exited with reason {service_rebalance_failed,fts,
 {agent_died,<33557.25272.84>,
  {linked_process_died,<33557.11877.86>,
   {'ns_1@svc-dqs-node-002.wlvswbdogdobi7s6.nonprod-project-avengers.com',
    {{badmatch,
      {false,
       {topology,[],
        [<<"7f65b5ad837d8fe4092435313490197b">>,
         <<"8a57b22dcc686d4dcfc14195b7860a00">>,
         <<"925ab73628a041b289606258b7e757c3">>,
         <<"a9eb96044f9eb78083e22a226e9ca2c9">>],
        false,
        [<<"error: nodes: sample res.StatusCode not 200, res: &http.Response{Status:\"503 Service Unavailable\", StatusCode:503, Proto:\"HTTP/1.1\", ProtoMajor:1, ProtoMinor:1, Header:http.Header{\"Content-Length\":[]string{\"50\"}, \"Content-Type\":[]string{\"text/plain; charset=utf-8\"}, \"Date\":[]string{\"Wed, 19 Jul 2023 15:54:37 GMT\"}}, Body:(*http.bodyEOFSignal)(0xc1881c8780), ContentLength:50, TransferEncoding:[]string(nil), Close:false, Uncompressed:false, Trailer:http.Header(nil), Request:(*http.Request)(0xc201f54600), TLS:(*tls.ConnectionState)(0xc1ceed7600)}, urlUUID: monitor.UrlUUID{Url:\"https://svc-dqs-node-003.wlvswbdogdobi7s6.nonprod-project-avengers.com:18094\", UUID:\"925ab73628a041b289606258b7e757c3\"}, kind: /api/stats?partitions=true, err: <nil>">>]},
       {topology,[],
        [<<"7f65b5ad837d8fe4092435313490197b">>,
         <<"8a57b22dcc686d4dcfc14195b7860a00">>,
         <<"925ab73628a041b289606258b7e757c3">>,
         <<"a9eb96044f9eb78083e22a226e9ca2c9">>],
        true,[]}}},
     [{service_agent,long_poll_worker_loop,5,
       [{file,"src/service_agent.erl"},
        {line,605}]},
      {proc_lib,init_p,3,
       [{file,"proc_lib.erl"},{line,211}]}]}}}}}.
Rebalance Operation Id = e221201a218da0abe3d32a31ba87fa31
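When triaging, it can help to pull the failing node out of the long error string rather than reading it by eye. The following is a minimal sketch that extracts the status code, node URL, node UUID and request kind from a message shaped like the one above. The function name parse_monitor_error and the regular expression are illustrative only and are not part of cbgt or ns_server; the pattern is written against this one message format and may need adjusting for others.

```python
import re

# Matches the monitor error text as it appears in the rebalance failure above.
# Quotes may or may not be backslash-escaped, depending on where the text was copied from.
_MONITOR_ERROR = re.compile(
    r'StatusCode:(?P<status>\d+).*?'
    r'Url:\\?"(?P<url>[^"\\]+)\\?",\s*'
    r'UUID:\\?"(?P<uuid>[0-9a-f]+)\\?"\},\s*'
    r'kind:\s*(?P<kind>\S+?),',
    re.DOTALL,
)


def parse_monitor_error(message):
    """Return status, url, uuid and kind from the error text, or None if it does not match."""
    match = _MONITOR_ERROR.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields
```

Applied to the error above, this would point at svc-dqs-node-003 (UUID 925ab73628a041b289606258b7e757c3) answering 503 on /api/stats?partitions=true.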
FYI: Disk and CPU utilisation on all nodes appear to be under control and healthy.
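Since the failure comes from a node answering 503 on the stats request, one way to check node readiness before starting the scale-in is to probe the same endpoint on every FTS node. The sketch below assumes HTTP basic auth and a certificate that the default TLS context accepts; fetch_status and unhealthy_nodes are hypothetical helper names, and only the path /api/stats?partitions=true and port 18094 come from the error above.

```python
import base64
import urllib.error
import urllib.request

STATS_PATH = "/api/stats?partitions=true"  # request kind named in the error above


def fetch_status(base_url, username, password, timeout=10.0):
    """Return the HTTP status code of the stats endpoint on one node.

    Connection-level failures (DNS, TLS, timeouts) are not caught and propagate.
    """
    request = urllib.request.Request(base_url.rstrip("/") + STATS_PATH)
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx answers such as 503 arrive as HTTPError; report the code.
        return exc.code


def unhealthy_nodes(base_urls, get_status):
    """Map each node URL whose status is not 200 to the status it returned."""
    failures = {}
    for url in base_urls:
        code = get_status(url)
        if code != 200:
            failures[url] = code
    return failures
```

A caller could run `unhealthy_nodes(urls, lambda u: fetch_status(u, user, password))` after the 15 min sleep and hold the rebalance until the result is empty. Passing the status function in as a parameter keeps the check easy to exercise without a live cluster.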
Attachments
For Gerrit Dashboard: MB-57947

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
194718,2 | MB-57947: 503 Service Unavailable Toy Build | master | cbgt | ABANDONED | 0 | 0 |
194719,1 | MB-57947: 503 Service Unavailable Toy Build | neo | cbgt | ABANDONED | 0 | 0 |