Search: 1K indexes: One of index creation failed with i/o timeout

Description

Build: 7.1.0-1566
Test: -test tests/fts/cheshire-cat/test_fts_clusterops_cheshire_cat_coll_crud_freetier.yml -scope tests/fts/cheshire-cat/scope_fts_cheshire_cat_free_tier.yml

Cluster with 3 nodes, each running kv, n1ql, search, and index
Create 1 bucket, 100 scopes, and 10 collections in each scope
Create 2500 GSI indexes (5 on each collection)
Load documents on some of the collections
Create 1000 indexes: one index (1 partition) on each collection
Run queries on each collection
Mutate the documents on each collection and wait for all the indexes to process the mutations
Run queries on each collection
Delete all the indexes
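
For reference, the "one index (1 partition) on each collection" step can be sketched roughly as below. This is only an illustration and not the test's actual code: the function name, the `idx_...` index names, the zero-based `scope_N`/`coll_M` numbering, and the `type` type field are all assumptions, and the definition fields follow my reading of the FTS index definition format, so verify them against the Search REST API docs before reuse.

```python
def build_index_definitions(bucket, num_scopes=100, colls_per_scope=10):
    """Return {index_name: definition}, one single-partition index per collection.

    Hypothetical helper: naming and numbering are assumed, not taken from the test.
    """
    definitions = {}
    for s in range(num_scopes):
        for c in range(colls_per_scope):
            scope, coll = f"scope_{s}", f"coll_{c}"
            name = f"idx_{scope}_{coll}"  # assumed naming scheme
            definitions[name] = {
                "type": "fulltext-index",
                "name": name,
                "sourceType": "gocbcore",
                "sourceName": bucket,
                # one partition per index, as in the test description
                "planParams": {"indexPartitions": 1},
                "params": {
                    "doc_config": {
                        "mode": "scope.collection.type_field",
                        "type_field": "type",  # assumed type field
                    },
                    "mapping": {
                        "default_mapping": {"enabled": False},
                        # restrict the index to a single scope.collection
                        "types": {
                            f"{scope}.{coll}": {"enabled": True, "dynamic": True}
                        },
                    },
                },
            }
    return definitions
```

Each definition would then be sent as the JSON body of a separate index create request, so a run like this issues 1000 independent requests, any one of which can hit a timeout on its own.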

During creation of the 1000 indexes, the create request for one index on scope_78.coll_1 failed with the error below:

test log:

I do not see any significant information in the fts log for this request, but here are the logs around the above timestamp.

Logs:
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1635430507/collectinfo-2021-10-28T141509-ns_1%40172.23.100.161.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1635430507/collectinfo-2021-10-28T141509-ns_1%40172.23.100.162.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1635430507/collectinfo-2021-10-28T141509-ns_1%40172.23.100.163.zip

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Linked issues

relates to

MB-48848

Search: 1K indexes: some indexes' create/delete requests fail with server write time out

Activity

Sreekanth Sivasankaran January 25, 2022 at 10:41 AM

Fixes for the issue would help fix this one as well.

Abhi Dangeti January 12, 2022 at 12:55 AM

Would you share the logs from where you saw this issue on builds later than 7.1.0-1643?

CB robot November 4, 2021 at 6:21 PM

Build couchbase-server-7.1.0-1643 contains cbft commit 956e193 with commit message:
: Prefix context to restRequestParser errors

Abhi Dangeti November 3, 2021 at 2:03 PM

I've been looking at so many metaKV issues that I guessed this could be another one. This call in the preparePerms() code path can involve a metaKV fetch:

https://github.com/couchbase/cbft/blob/master/rest_auth.go#L364

The change I reverted is to continue testing of enforcing limits (on fts indexes - we can chat on this separately). Since I thought it could affect this test, I asked for a retest.

But now that I look at the error again ..

It looks like the i/o timeout is between node 172.23.100.161 and 172.23.107.77, which is not even part of the cluster, so possibly the client? If this is the request parser timing out, I believe it's the first time we're seeing this.
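
To illustrate the "request parser timing out" scenario in general terms: a server that puts a read deadline on the connection will fail with a read timeout if the client sends only part of the request body and then stalls. The sketch below is a plain-socket Python analogue, not cbft code; `read_body_with_timeout`, `demo_stalled_client`, and the byte counts are hypothetical names and values chosen for the demonstration.

```python
import socket
import threading


def read_body_with_timeout(conn, expected_len, timeout_s):
    """Read expected_len bytes, failing if the peer stalls longer than timeout_s."""
    conn.settimeout(timeout_s)
    chunks, remaining = [], expected_len
    while remaining > 0:
        chunk = conn.recv(remaining)  # raises socket.timeout if no data arrives in time
        if not chunk:
            raise ConnectionError("peer closed before sending the full body")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo_stalled_client(timeout_s=0.2):
    """Run a local server and a client that sends 10 of 100 promised bytes."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}

    def serve():
        conn, _ = server.accept()
        with conn:
            try:
                read_body_with_timeout(conn, expected_len=100, timeout_s=timeout_s)
                result["outcome"] = "ok"
            except socket.timeout:
                result["outcome"] = "read timeout"

    worker = threading.Thread(target=serve)
    worker.start()
    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"x" * 10)  # partial body, then the client goes quiet
    worker.join()              # returns once the server-side read gives up
    client.close()
    server.close()
    return result["outcome"]
```

In this setup the failure is reported on the server side even though the stall is on the client side, which would be consistent with the timeout naming an address outside the cluster.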

Sreekanth Sivasankaran November 3, 2021 at 5:35 AM
Edited

May I know which of your commits from the other ticket is supposed to address this timeout?

I'm not sure I see a metakv fetch here in the preparePerms() call.

Doesn't this look more like a socket read i/o timeout while reading the request contents itself?

Aside from this, now that you have reverted and brought back the metakv fetches, aren't we going to see the former metakv-related timeouts again going forward?

Duplicate

Details

Assignee

Girish Benakappa

Reporter

Girish Benakappa

Is this a Regression?

Unknown

Triage

Untriaged

Story Points

Priority

Major

Created November 2, 2021 at 2:25 AM

Updated January 25, 2022 at 10:42 AM

Resolved January 25, 2022 at 10:42 AM
