Details

Type: Bug
Resolution: Duplicate
Priority: Critical
Fix Version/s: None
Affects Version/s: 7.2.1
Component/s: couchbase-bucket
Labels:
- zenith

Triage:
Untriaged
Link to Log File, atop/blg, CBCollectInfo, Core dump:

https://cb-engineering.s3.amazonaws.com/ftssystemtest/collectinfo-2023-08-11T120504-ns_1%40svc-dqs-node-001.lie3v0iv5ulitlp.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/ftssystemtest/collectinfo-2023-08-11T120504-ns_1%40svc-dqs-node-002.lie3v0iv5ulitlp.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/ftssystemtest/collectinfo-2023-08-11T120504-ns_1%40svc-dqs-node-003.lie3v0iv5ulitlp.sandbox.nonprod-project-avengers.com.zip
https://cb-engineering.s3.amazonaws.com/ftssystemtest/collectinfo-2023-08-11T120504-ns_1%40svc-dqs-node-004.lie3v0iv5ulitlp.sandbox.nonprod-project-avengers.com.zip

Story Points:
0
Is this a Regression?:
Unknown

Description

Test Steps:

1. 15 scopes and 60 collections across 3 buckets
   - Each bucket -> 10M data, 1 index on each collection -> 60 indexes (each index ~400k docs)
2. Sleep for 15 mins
3. Run FTS queries and FTS Flex queries in random fashion for 2 hrs
4. Kill CBFT
5. Sleep for 15 mins
6. Scale out to 5 nodes
7. Create 10 more FTS indexes with the following config on default scope and default index (500k docs per index):
   - 1 Index | 1 Replica | 5 Partitions
   - 3 Indexes | 2 Replicas | 6 Partitions
   - 6 Indexes | 0 Replicas | 4 Partitions
8. Sleep for 15 mins
9. Run FTS queries and FTS Flex queries in random fashion for 30 mins again
10. Kill CBFT
11. Sleep for 15 mins
12. Rebalance/Scale in to 4 nodes
13. Create 10 more FTS indexes with the following config:
    - 1 Index | 1 Replica | 5 Partitions
    - 3 Indexes | 2 Replicas | 6 Partitions
    - 6 Indexes | 0 Replicas | 4 Partitions
14. Run FTS queries and FTS Flex queries in random fashion for 30 mins again
15. Kill memcached
16. Sleep for 15 mins
17. Scale in back to 3 nodes
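
For sizing purposes, the load added by each batch of 10 extra indexes in the steps above can be tallied with a short sketch. It assumes that every replica holds a full copy of each partition, so one index contributes `partitions * (1 + replicas)` partition copies. The `IndexGroup` class and `summarize` helper are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexGroup:
    """One row of the index config from the test steps (hypothetical helper)."""
    indexes: int
    replicas: int
    partitions: int

    def partition_copies(self) -> int:
        # Assumption: each partition exists once as active plus once per replica.
        return self.indexes * self.partitions * (1 + self.replicas)


# The batch created after scale out and again after scale in.
BATCH = [
    IndexGroup(indexes=1, replicas=1, partitions=5),
    IndexGroup(indexes=3, replicas=2, partitions=6),
    IndexGroup(indexes=6, replicas=0, partitions=4),
]


def summarize(groups):
    """Return the total index count and total partition copies for a batch."""
    return {
        "indexes": sum(g.indexes for g in groups),
        "partition_copies": sum(g.partition_copies() for g in groups),
    }


if __name__ == "__main__":
    print(summarize(BATCH))
```

Under that assumption each batch adds 10 indexes and 88 partition copies (10 + 54 + 24) to be spread across the nodes.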

Test Logs: http://qe-jenkins1.sc.couchbase.com/job/cp-cli-fts-system-test/7/console

I am seeing that the KV rebalance has hung.
I suspect a sizing issue, as the default1 bucket goes to 0% resident ratio (RR), although even fewer docs would cause the same, as filed in ~~MB-58014~~.
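
One way to tell a hung rebalance from a slow one is to compare progress samples over a time window. The sketch below is only an illustration: it assumes the samples are `(timestamp_seconds, progress_percent)` pairs gathered by some external poller, and `is_stalled`, `window_s` and `min_delta` are hypothetical names and thresholds, not part of any Couchbase tool.

```python
def is_stalled(samples, window_s=900.0, min_delta=0.1):
    """Return True if progress moved less than min_delta over the last window_s seconds.

    samples: list of (timestamp_seconds, progress_percent), oldest first.
    Returns False when the history is shorter than the window or the
    rebalance has already reached 100%.
    """
    if not samples:
        return False
    latest_t, latest_p = samples[-1]
    if latest_p >= 100.0:
        return False

    # Baseline is the newest sample that is at least window_s older than the latest.
    cutoff = latest_t - window_s
    baseline = None
    for t, p in samples:
        if t <= cutoff:
            baseline = p
        else:
            break
    if baseline is None:
        return False  # not enough history to decide
    return (latest_p - baseline) < min_delta


if __name__ == "__main__":
    hung = [(0, 10.0), (300, 42.0), (600, 42.0), (900, 42.0), (1200, 42.0)]
    print(is_stalled(hung))
```

The 15 minute default window simply mirrors the sleep interval used in the test and would need tuning for a real cluster.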

On this bucket itself the RAM used is 3.6GB/4GB, which puts it at 90% or more of allocated memory. I am not sure why, because all buckets have the same size and number of docs, but this hints towards an undersized cluster.
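
The memory figure above works out as follows. This is a minimal sketch using the rounded values from this ticket; `quota_usage`, `is_near_quota` and the 0.85 alert threshold are hypothetical choices for illustration only.

```python
def quota_usage(mem_used_gb, quota_gb):
    """Fraction of the bucket RAM quota currently in use."""
    if quota_gb <= 0:
        raise ValueError("quota_gb must be positive")
    return mem_used_gb / quota_gb


def is_near_quota(mem_used_gb, quota_gb, threshold=0.85):
    """True when usage is at or above the (hypothetical) alert threshold."""
    return quota_usage(mem_used_gb, quota_gb) >= threshold


if __name__ == "__main__":
    usage = quota_usage(3.6, 4.0)  # values reported for the default1 bucket
    print(f"{usage:.0%} of quota used, near quota: {is_near_quota(3.6, 4.0)}")
```

With 3.6GB used of a 4GB quota, the bucket sits at about 90% of its allocation, which fits the near-zero resident ratio seen on default1.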

I am also seeing a node getting failed over during this process:

Failed over ['ns_1@svc-dqs-node-004.lie3v0iv5ulitlp.sandbox.nonprod-project-avengers.com']: ok failover 000 ns_1@svc-dqs-node-001.lie3v0iv5ulitlp.sandbox.nonprod-project-avengers.com 8: