Couchbase fails to create collection index right after the creation of a collection

Description

 

CollectionManager collectionManager = bucket.collections();
try {
    CollectionSpec spec = CollectionSpec.create(CollectionNames.PROFILE, bucket.defaultScope().name());
    collectionManager.createCollection(spec);
    //Thread.sleep(10000);
} catch (CollectionExistsException e) {
    System.out.println(String.format("Collection <%s> already exists", CollectionNames.PROFILE));
} catch (Exception e) {
    System.out.println(String.format("Generic error <%s>", e.getMessage()));
}

try {
    final QueryResult result = cluster.query("CREATE PRIMARY INDEX default_profile_index ON "
            + props.getBucketName() + "._default." + CollectionNames.PROFILE);
    for (JsonObject row : result.rowsAsObject()) {
        System.out.println(String.format("Index Creation Status %s", row.getObject("meta").getString("status")));
    }
} catch (IndexExistsException e) {
    System.out.println(String.format("Collection's primary index already exists"));
} catch (Exception e) {
    System.out.println(String.format("General error <%s> when trying to create index ", e.getMessage()));
}

We tried this in multiple languages and always get the same error:

 

{"completed":true,"coreId":"0x1345501c00000001","errors":[{"code":12003,"message":"Keyspace not found in CB datastore: default:user_profile._default.profile"}],"idempotent":false,"lastDispatchedFrom":"127.0.0.1:63120","lastDispatchedTo":"localhost:8093","requestId":13,"requestType":"QueryRequest","retried":0,"service":{"operationId":"526fcd12-023b-4628-a510-1a250794725c","statement":"DELETE FROM user_profile._default.profile ","type":"query"},"timeoutMs":75000,"timings":{"dispatchMicros":1132,"totalDispatchMicros":1132,"totalMicros":1952}}

But if we add a sleep of 10 seconds, the code works fine. Ideally, the SDK should only send the acknowledgment after the collection is created. 
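Until such an acknowledgment exists, one alternative to a fixed sleep is to retry the dependent statement with a bounded backoff. The following is a minimal sketch, not part of the SDK: `KeyspaceWait` and `retryUntil` are hypothetical names, and the helper uses only the Java standard library.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not an SDK class): polls a condition instead of sleeping a fixed time. */
final class KeyspaceWait {
    private KeyspaceWait() {}

    /**
     * Calls attempt until it returns true or the timeout elapses.
     * Returns true on success, false on timeout or interruption.
     */
    static boolean retryUntil(BooleanSupplier attempt, Duration timeout, Duration initialDelay) {
        long deadline = System.nanoTime() + timeout.toNanos();
        long delayMs = Math.max(1, initialDelay.toMillis());
        while (true) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000;
            if (remainingMs <= 0) {
                return false;
            }
            try {
                Thread.sleep(Math.min(delayMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            delayMs = Math.min(delayMs * 2, 1_000); // exponential backoff, capped at 1 second
        }
    }
}
```

In the code above, the `attempt` would wrap the `cluster.query(...)` call and return false when the query fails with the "Keyspace not found" error (code 12003). This only hides the race from the caller; it does not remove it.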

Activity

Vinathi Kanna September 25, 2024 at 4:06 AM

Removing Morpheus fix version and adding Ponyo.

Marco Greco July 20, 2021 at 11:32 AM

From a query point of view, I wouldn't be particularly happy with the need for yet another scan_vector, timestamp, etc. that has to be passed around. This is yet more work for the developer (and, to be frank, the people we are trying to attract are used to not having to do this sort of thing; we would be heading for another 'why do we have to create a primary index?' scenario). Having more parameters to process while handling REST API calls is also something we want to avoid, as it dents N1QL's throughput.
I think that, although there is merit in having DDL asynchronous and checking for consistency at the end of the script, DDL is likely to depend on other DDL. In practice, scripts will end up with many specific checks for individual scopes and collections, making them harder to develop and less efficient to run, which defeats the object of the exercise.

This said, if it is too cumbersome for ns_server to switch to synchronous DDL, then as long as a primitive is provided to check that a particular DDL has completed, via some "wait until some timestamp has been replicated" API, it would be simple enough for N1QL to make create/drop scope/collection synchronous by using said primitive ourselves.

It would not be a complete solution - but it would address the problem at least from a N1QL and N1QL user (SDK, cbq, UI) point of view.

Matt Ingenthron July 19, 2021 at 5:05 PM

Hey, you updated the description just now saying…

But if we add a sleep of 10 seconds, the code works fine. Ideally, the SDK should only send the acknowledgment after the collection is created.

At the moment, there is no way to determine when the collection has been created from the cluster. There is no way for an SDK to know reliably that an indexer process is aware of the collection creation. So it's not about the "SDK sending acknowledgement" per se.

From a meeting last Friday, we talked about two options:
1) Add some kind of 'synchronous' option to these operations at ns_server level, etc. This could then be used by SDK, cbq, cbrestore, other tools. Or…
2) Add a method of retrieving the configuration in use and pass that along to other services with future requests, so those services can block until their configuration is at that level or later before processing the request.
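Option 2 can be pictured as a gate inside each service that holds a request until the service's own configuration has caught up. The sketch below is a toy model only: `RevisionGate`, `apply`, and `awaitAtLeast` are hypothetical names, and the configuration is reduced to a single increasing revision number, which is an assumption rather than how ns_server represents it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of a service tracking the newest config revision it has applied. */
final class RevisionGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition advanced = lock.newCondition();
    private long appliedRevision = 0;

    /** Called when the service applies a newer cluster configuration. */
    void apply(long revision) {
        lock.lock();
        try {
            if (revision > appliedRevision) {
                appliedRevision = revision;
                advanced.signalAll(); // wake requests waiting for this revision
            }
        } finally {
            lock.unlock();
        }
    }

    /** Blocks until the applied revision is at least required; false on timeout or interrupt. */
    boolean awaitAtLeast(long required, long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (appliedRevision < required) {
                if (nanos <= 0) {
                    return false;
                }
                nanos = advanced.awaitNanos(nanos); // returns the remaining wait time
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

Under this model, the client would receive a revision from the create call and send it with the next request; the receiving service would call `awaitAtLeast` with that revision before executing the statement.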

Discussion ended with agreement that we want to get it on the list as a 7.1 requirement, even if a subset, with a roadmap to covering many of these similar cases. Other examples include bucket creation, bucket flush, GSI index build, FTS index build, etc. All of these are important to the developer use case with test scaffolding.

Denis Souza Rosa July 9, 2021 at 10:15 AM

My take on this: It is a critical/blocker bug.

On the apps that we built using CB 7.0, we use scopes as tenants. Whenever a new "client" comes in and creates a new tenant, we need to create a new scope and a number of collections and indexes, and insert some basic data. Today the only way we can do this successfully is to add a bunch of "sleeps" to our code and hope that, when the app wakes up, the collection has already been created.

Even if we try to push the responsibility to the developer, at least when I reported this bug there was no proper way in the SDK to check whether a collection exists (other than trying to create the collection and handling the error).

Kamini Jagtiani July 8, 2021 at 2:37 AM
Edited

If a user gives 

     create scope

     create collection

     create index

     insert

and insert accepts a metadata consistency predicate returned by one of the create statements, it means we are putting the onus on the developer, and every DML statement (insert/update/delete/merge/nest/unnest) would now need to accept an optional query parameter with some kind of metadata.

Should we instead try to make the create statements blocking?

The N1QL node that received the create scope/collection statement would, after successful execution, probe the other nodes to see whether they have the scope/collection, and return success only after it receives a success from all nodes.

I do recognize that create will become a slow statement to execute depending on the size of the cluster.

But create scope and create collection are our schema statements, which are not run often.

Why alter the behavior of more commonly used DML statements?
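The blocking create described above could look roughly like the following. This is an illustrative sketch under stated assumptions, not query-service code: `BlockingCreate`, `Node`, `hasKeyspace`, and `awaitOnAllNodes` are hypothetical names, and the 5 ms probe interval is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockingCreate {
    private BlockingCreate() {}

    /** Hypothetical view of a cluster node that can report whether it knows a keyspace. */
    interface Node {
        boolean hasKeyspace(String keyspace);
    }

    /** Returns true once every node reports the keyspace, false if the timeout elapses first. */
    static boolean awaitOnAllNodes(List<Node> nodes, String keyspace, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        List<Node> pending = new ArrayList<>(nodes);
        while (true) {
            // Probe only the nodes that have not confirmed yet.
            pending.removeIf(node -> node.hasKeyspace(keyspace));
            if (pending.isEmpty()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(5); // arbitrary probe interval for this sketch
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The create statement would report success only when `awaitOnAllNodes` returns true, so the wait is bounded by the slowest node, which matches the concern above about larger clusters.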

 

Created May 31, 2021 at 9:39 AM
Updated November 7, 2024 at 2:57 PM