FIT: Issue in returned QueryIndex: null values for Bucket, Scope and Collection names

Description

The SDK returns QueryIndexes from a Query Index Management operation but omits the Bucket, Scope and Collection (Keyspace) name fields when the Collection/Keyspace is _default.
This bug was found by FIT when implementing .

Resolution
-----------
The BucketName field is supposed to fall back to the Keyspace value when it is null. Added a backing field for BucketName in QueryIndex to implement this.
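
For illustration only, the fallback can be sketched as below. The actual fix lives in the .NET SDK's QueryIndex type. This Java sketch uses a hypothetical class and hypothetical field names, and assumes the query response carries the bucket name in the keyspace field when the index is on the _default collection.

```java
// Hypothetical sketch; this is not the SDK's QueryIndex type.
final class QueryIndexSketch {
    // Backing field: may be null when the query response omits the bucket name.
    private final String bucketName;
    // Keyspace value from the query response (assumed to hold the bucket name
    // for indexes on the _default collection).
    private final String keyspace;

    QueryIndexSketch(String bucketName, String keyspace) {
        this.bucketName = bucketName;
        this.keyspace = keyspace;
    }

    // Returns the explicit bucket name if present, otherwise the keyspace value.
    String bucketName() {
        return bucketName != null ? bucketName : keyspace;
    }
}
```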

Other bugs
------------
The tests intermittently fail due to eventual consistency/driver issues.

Environment

None

Gerrit Reviews

None

Release Notes Description

None

Activity

Graham Pople July 7, 2023 at 12:49 PM
Edited

(5) will also get handled by QueryUtils.createPrimaryIndex().  That's a common one to see when rerunning tests.

So (4) is indeed pretty weird with that update.  I assume those fields all come directly from the query response?  If so, I wonder if you've stumbled across a legit query service error...  Though I am curious why no-one else is hitting it.

Emilien Bevierre July 7, 2023 at 12:42 PM

Thanks for the detailed breakdown, I'll hop on getting those fixed.
For #4 I suspected this too, but what bothers me is that it doesn't fail consistently. I re-ran the test file a couple of times and added the Bucket/Scope/Collection name fields to the debug output:

So the Bucket/Scope/Collection name fields are null in the QueryIndex, but this only happens sometimes and works fine if I run the test independently. I'll assume it's something to do with the scope/collection creation and hopefully fixing the other issues first will resolve this one too.

I'll also add this to the intermittent errors:

type: SDK_INDEX_EXISTS_EXCEPTION
serialized: "Couchbase.Core.Exceptions.IndexExistsException: The index #primary already exists.

Graham Pople July 7, 2023 at 12:24 PM

Oh I think I misread (4).  That looks to be something like the performer has returned the correct index name, but maybe not the correct scope or bucket or collection name?  Which would tie up both with this ticket, and with this issue not being seen in other SDKs.  So maybe we don't need all that new consistency logic.

Graham Pople July 7, 2023 at 10:26 AM
Edited

Thanks, and yes I've seen all of these at various times.  Please add these workarounds into the driver side so that all SDKs benefit:

  1. This will probably be solved by a) waiting for the scope to exist on all nodes and b) waiting for the indexer to be aware of the collection the scope is on.   For (a) can use ConsistencyUtil.waitUntilCollectionPresent and for (b) can use ClusterUtil.waitForQueryIndexerToHaveCollection - actually waitForQueryIndexerToHaveCollection is probably sufficient on its own.  Alternatively, if you're creating an index here, can use QueryUtils.createPrimaryIndex which will automatically do waitForQueryIndexerToHaveCollection for you.

  2. Urgh, that's a pain.  If you're creating an index then I'd use  QueryUtils.createPrimaryIndex which will automatically retry a few times on failure.  Probably to handle exactly this error.

  3. Looks like similar issues to (1) and (2).  I'd just use QueryUtils.createPrimaryIndex here too.  If you need to create a non-primary index, feel free to create a new similar method.

  4. Hmm.  I suspect this one is a consistency issue.  Maybe the SDK has created the index on node 1, but it's not yet propagated to the other nodes.  Then the index query is hitting node 3 and the index doesn't exist there yet.
    I'm afraid it's a little more work to fix this one, since we don't have a ready-rolled consistency method for this.  I would fix it by adding in to TempUtil a new waitUntilQueryIndexPresent().  This code would poll a query index REST endpoint on all nodes until they all 200 for this index.  If you look at the existing methods here you'll see it's not much work - there are helper routines that will do most of it.
    I'd then call this waitUntilQueryIndexPresent() in every test that creates an index and then queries on it.  That should fix it.
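
A rough sketch of what the waitUntilQueryIndexPresent() idea from (4) might look like. This is not the driver's code: the class name, the parameters, and the indexPresentOnNode predicate (a stand-in for the per-node REST check returning 200) are all assumptions made for illustration.

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; the real helper would live alongside the driver's existing utilities.
final class IndexWaitSketch {
    // Polls until indexPresentOnNode is true for every node, or the timeout elapses.
    // indexPresentOnNode stands in for "the node's REST endpoint returned 200 for this index".
    static void waitUntilQueryIndexPresent(List<String> nodes,
                                           Predicate<String> indexPresentOnNode,
                                           Duration timeout,
                                           Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (nodes.stream().allMatch(indexPresentOnNode)) {
                return; // every node can see the index
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("Index not present on all nodes within " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while waiting for index", e);
            }
        }
    }
}
```

A test that creates an index and then queries on it would call this between the two steps, passing whatever per-node check the driver already has available.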

Jeffry Morris July 6, 2023 at 10:05 PM

It might be worth adding a delay between operations, or perhaps writing an extension method that takes a func and retries it until it succeeds or times out, with the set of handled exception types adjustable. If done correctly, it could be reusable and help with any of the "eventual consistency" problems.
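
One possible shape for such a retry helper, sketched in Java rather than as a .NET extension method. The class name, method name, and parameters are hypothetical. The isRetryable predicate plays the role of "exception types handled".

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical sketch of a reusable retry-until-success-or-timeout helper.
final class RetrySketch {
    // Runs op until it returns normally. Retries only while isRetryable accepts the
    // exception and the timeout has not elapsed; otherwise the failure is rethrown wrapped.
    static <T> T retry(Callable<T> op,
                       Predicate<Exception> isRetryable,
                       Duration timeout,
                       Duration delay) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return op.call();
            } catch (Exception e) {
                boolean expired = System.nanoTime() - deadline >= 0;
                if (expired || !isRetryable.test(e)) {
                    throw new RuntimeException("Operation failed and will not be retried", e);
                }
            }
            try {
                Thread.sleep(delay.toMillis()); // fixed delay between attempts
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new RuntimeException("Interrupted while retrying", ie);
            }
        }
    }
}
```

A fixed delay keeps the sketch short. Backoff, or a per-call list of exception types, could be layered on top without changing the call sites.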

Fixed

Created April 4, 2023 at 12:48 PM
Updated July 10, 2023 at 3:07 PM
Resolved July 10, 2023 at 3:07 PM