Details

Type: Bug
Resolution: Won't Fix
Priority: Critical
Fix Version/s: 7.6.2
Affects Version/s: 7.6.2
Component/s: fts
Labels:
- vector-search
Environment:
7.6.2-3714

Triage:
Untriaged
Link to Log File, atop/blg, CBCollectInfo, Core dump:

New Snapshot → http://supportal.couchbase.com/snapshot/340956b4ac116624083285d0ffc6ceb8::0

s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-d-node-001.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-d68f91f97b487f7c.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-d-node-002.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-ce1994d95128a651.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-d-node-003.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-05b9439d036ebc08.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-q-node-009.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-3a81830d5672684f.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-q-node-010.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-4f0387f66a7c79bc.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-s-node-004.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-1146128358d841a2.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-s-node-005.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-b69c20bb30806127.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-s-node-006.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-bdf3a1bac0630a5f.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-s-node-007.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-9daa581f0b61bf9a.zip
s3://cb-customers-secure/sarthakftsrebalance1/2024-06-15/collectinfo-2024-06-15t074830-ns_1@svc-s-node-008.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com-b179e3d755978274.zip
Story Points:
0
Is this a Regression?:
Unknown

Description

Steps to reproduce:

Create a cluster with 5 FTS nodes, each with 32 GB RAM and 16 vCPUs
Load ~60M vector documents
Create 4 FTS indexes with 80 partitions per index (1 index partition per vCPU)
Keep inserting vector data continuously (DCP ingestion remains active for FTS)
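
For a rough sense of scale, the setup above can be turned into a per-node partition count, both for the 5-node cluster and for a 4-node cluster after one node is removed. This is only a sketch: it assumes no index replicas and an even spread of partitions across FTS nodes, neither of which is stated in this ticket, and the function name is illustrative.

```python
def partitions_per_node(indexes: int, partitions_per_index: int, nodes: int) -> float:
    """Average index partitions hosted per FTS node.

    Assumes zero replicas and an even distribution (assumptions, not from the ticket).
    """
    return indexes * partitions_per_index / nodes


VCPUS_PER_NODE = 16  # from the reproduction steps

before = partitions_per_node(indexes=4, partitions_per_index=80, nodes=5)
after = partitions_per_node(indexes=4, partitions_per_index=80, nodes=4)

print(before, before / VCPUS_PER_NODE)  # 64.0 partitions per node, 4.0 per vCPU
print(after, after / VCPUS_PER_NODE)    # 80.0 partitions per node, 5.0 per vCPU
```

Under those assumptions, each remaining node would take on 16 additional partitions during the rebalance-out while still ingesting.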

Rebalance out 1 node from the cluster. The rebalance fails with:

7:45:37 AM 15 Jun, 2024
Rebalance exited with reason {service_rebalance_failed,fts,
 {worker_died,
  {'EXIT',<0.7340.24>,
   {task_failed,rebalance,
    {service_error,
     <<"nodes: sample res.StatusCode not 200, res: &http.Response{Status:\"503 Service Unavailable\", StatusCode:503, Proto:\"HTTP/1.1\", ProtoMajor:1, ProtoMinor:1, Header:http.Header{\"Content-Length\":[]string{\"50\"}, \"Content-Type\":[]string{\"text/plain; charset=utf-8\"}, \"Date\":[]string{\"Sat, 15 Jun 2024 07:45:36 GMT\"}}, Body:(*http.bodyEOFSignal)(0xc2020bd740), ContentLength:50, TransferEncoding:[]string(nil), Close:false, Uncompressed:false, Trailer:http.Header(nil), Request:(*http.Request)(0xc0e5285b00), TLS:(*tls.ConnectionState)(0xc0111d6840)}, urlUUID: monitor.UrlUUID{Url:\"https://svc-s-node-008.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com:18094\", UUID:\"2c26104010bd76d7a003ae9aa05c34a6\"}, kind: /api/stats?partitions=true, err: <nil>">>}}}}}.

Rebalance Operation Id = 6d0c2df944dcaac77d21eb72c21450ff

es0y.customsubdomain.nonprod-project-avengers.com

ns_1@svc-q-node-009.vejyzmwnlrj3es0y.customsubdomain.nonprod-project-avengers.com

Stop the ingestion and let the cluster go idle. Retrigger the rebalance; this time it succeeds.

During the failed rebalance, both CPU and memory usage were normal (60% CPU usage and 40% memory usage).
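
One way to check whether the stats endpoint named in the error (`/api/stats?partitions=true` on port 18094) is intermittently returning 503 while ingestion is active is to poll it and tally the status codes. The sketch below is an illustration only: the helper names are hypothetical, and authentication and TLS trust configuration are omitted, so a real cluster would need credentials added before the request succeeds.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

STATS_PATH = "/api/stats?partitions=true"  # endpoint named in the rebalance error


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET on url (no auth; an assumption)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # non-2xx responses such as 503 are raised as HTTPError


def count_statuses(fetch: Callable[[], int], attempts: int, delay_s: float = 0.0) -> dict[int, int]:
    """Call fetch `attempts` times and tally the status codes it returns."""
    tally: dict[int, int] = {}
    for i in range(attempts):
        code = fetch()
        tally[code] = tally.get(code, 0) + 1
        if delay_s and i < attempts - 1:
            time.sleep(delay_s)  # pause between probes, but not after the last one
    return tally


# Hypothetical usage; replace the placeholder host with an FTS node:
# base = "https://<fts-node>:18094"
# print(count_statuses(lambda: http_status(base + STATS_PATH), attempts=20, delay_s=1.0))
```

Running the same probe once with ingestion active and once with the cluster idle would show whether the 503 responses correlate with DCP ingestion, as the two rebalance attempts suggest.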

Attachments

image-2024-06-17-18-43-34-876.png (987 kB, 17/Jun/24 6:13 AM)
image-2024-06-19-15-26-22-659.png (137 kB, 19/Jun/24 2:56 AM)
image-2024-06-19-16-28-13-270.png (227 kB, 19/Jun/24 3:58 AM)
image-2024-06-24-13-35-50-005.png (182 kB, 24/Jun/24 1:05 AM)

Issue Links

relates to

MB-62310 >95% queries erroring out with enough RAM and CPU availability

Open

Rebalance out operation fails with 503 Service Unavailable