Details
- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Critical
- Fix Version/s: None
- Affects Version/s: 7.1.0
- Triage: Untriaged
- 1
- Unknown
Description
Build: 7.1.0-1372
Test: -test tests/fts/cheshire-cat/test_fts_clusterops_cheshire_cat_coll_crud_freetier.yml -scope tests/fts/cheshire-cat/scope_fts_cheshire_cat_free_tier.yml
This test checks the scalability of FTS up to 1000 indexes. It is run on AWS on 8-core, 64 GB RAM boxes.
Steps:
- Create a cluster with 3 nodes, each running the kv, n1ql, search, and index services
- Create 1 bucket, 100 scopes, and 50 collections in each scope
- Create 500 FTS indexes: one index (1 partition) on each collection
- Create 2500 GSI indexes (5 on each collection)
- Load documents into some of the collections
- Run the FTS query workload
- Kill fts on 172.23.106.242 and wait for 25 mins
- Rebalance in a new node with the fts service. This was stuck for a long time and had to be manually aborted (see https://issues.couchbase.com/browse/MB-48514?focusedCommentId=548141&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-548141)
- Kill fts on 172.23.106.243 and wait for 15 mins
- Mutate and add some more docs
- Rebalance out 172.31.63.153 (a data+index+query+fts node)
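For reference, the per-collection scoped FTS index definitions in the steps above can be sketched roughly as below. This is a minimal sketch, not the exact definition the test uses; the bucket/scope/collection/index names are illustrative placeholders.

```python
import json

def fts_index_definition(bucket, scope, collection, name):
    """Build a minimal single-partition scoped FTS index definition
    (one index per collection, as in the test steps above).
    Field values here are a sketch, not the test's exact settings."""
    return {
        "name": name,
        "type": "fulltext-index",
        "sourceType": "gocbcore",
        "sourceName": bucket,
        # 1 partition per index, matching the step above
        "planParams": {"indexPartitions": 1},
        "params": {
            "doc_config": {"mode": "scope.collection.type_field"},
            "mapping": {
                "default_mapping": {"enabled": False},
                "types": {
                    # map only this scope.collection
                    f"{scope}.{collection}": {"enabled": True, "dynamic": True}
                },
            },
        },
    }

# One definition per collection; creating it would be a PUT to
# http://<fts-node>:8094/api/index/<name> with this JSON body.
defn = fts_index_definition("bucket1", "scope_1", "coll_1", "idx_scope_1_coll_1")
print(json.dumps(defn["planParams"]))
```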
This rebalance step started at 2021-09-30T15:46:15 UTC.
[2021-09-30T15:46:15Z, sequoiatools/couchbase-cli:7.0:cc35fd] rebalance -c 172.31.56.155:8091 --server-remove 172.31.63.153 -u Administrator -p password
This rebalance is stuck in the data service phase at 82% progress (560 out of 682 vbuckets moved for bucket1) for 3+ hrs.
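The stalled progress above can be polled via ns_server's REST API (GET /pools/default/rebalanceProgress on port 8091). The endpoint is real, but the exact response shape in this sketch is an assumption based on the classic per-node format; newer builds expose richer detail under /pools/default/tasks.

```python
# Hedged sketch: extract per-node progress from a rebalanceProgress-style
# payload. The sample payload below is illustrative, not captured from
# this cluster.
sample = {
    "status": "running",
    "ns_1@172.31.56.155": {"progress": 0.82},
    "ns_1@172.31.63.153": {"progress": 0.82},
}

def node_progress(payload):
    """Return {node: percent} for every per-node entry, skipping
    top-level scalar fields like "status"."""
    return {
        node: round(info["progress"] * 100, 1)
        for node, info in payload.items()
        if isinstance(info, dict) and "progress" in info
    }

print(node_progress(sample))
```

A progress value pinned at the same percentage across polls, as seen here, is what distinguishes a stuck rebalance from a slow one.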
Due to MB-48649 I could not upload the logs to S3. They have been copied to 172.23.120.160 and are at /root/fts_500_logs/eagleeye_iteration36.tar.
To let the test move forward, I will abort the rebalance shortly.
Attachments
Issue Links
- relates to MB-49006: Prioritise backfills for replication DCP streams (Open)