Details
- Bug
- Resolution: Fixed
- Critical
- 7.0.0
- Untriaged
- 1
- Unknown
Description
I noticed this during some testing where I was creating and deleting buckets in rapid succession. During the delete, ns_server waits until the bucket is considered to be "not active" on the KV nodes in the cluster. In this case, ns_server waited even though there were no nodes to be waited on, and the bucket delete then timed out. Here's the key log trace:
[ns_server:warn,2021-08-19T10:49:23.656-07:00,n_0@127.0.0.1:<0.1628.0>:ns_orchestrator:idle:640]Nodes [] failed to delete bucket "b_0" within expected time.
You can see that the list of nodes that didn't respond in time is empty. The issue appears to be that we wait even though there are no nodes to wait on. This change appears to address the issue and may be a useful starting point: http://review.couchbase.org/c/ns_server/+/159720
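To illustrate the suspected bug, here is a minimal sketch of a wait-for-deletion loop in Python (the real implementation is Erlang; the function and parameter names here are hypothetical, and `bucket_active_on` stands in for whatever check ns_server performs against each KV node). Without the early return on an empty node list, the caller blocks for the full timeout even though there is nothing to wait for:

```python
import time

def wait_for_bucket_deletion(nodes, bucket_active_on,
                             timeout_s=30.0, poll_s=0.1):
    """Wait until the bucket is no longer active on any node in `nodes`.

    Returns the list of nodes that failed to delete the bucket in time
    (empty on success). Hypothetical sketch, not the ns_server code.
    """
    # The suspected fix: with no nodes to wait on, return immediately
    # instead of sleeping until the timeout expires.
    if not nodes:
        return []
    remaining = list(nodes)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        remaining = [n for n in nodes if bucket_active_on(n)]
        if not remaining:
            return []
        time.sleep(poll_s)
    return remaining
```

With the empty-list guard, `wait_for_bucket_deletion([], check)` returns at once; without it, the loop would poll until the deadline and then report an empty list of failed nodes, which matches the log trace above.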
For completeness, here's where the bucket was created:
[ns_server:info,2021-08-19T10:48:53.632-07:00,n_0@127.0.0.1:ns_memcached-b_0<0.4374.0>:ns_memcached:do_ensure_bucket:1303]Created bucket "b_0" with config string "max_size=104857600;dbname=/Users/dfinlay/work8/ns_server/data/n_0/data/b_0;backend=couchdb;couch_bucket=b_0;max_vbuckets=64;alog_path=/Users/dfinlay/work8/ns_server/data/n_0/data/b_0/access.log;data_traffic_enabled=false;max_num_workers=3;uuid=910c2397af610d21ac57e5a6ce842154;conflict_resolution_type=seqno;bucket_type=persistent;durability_min_level=none;pitr_enabled=false;pitr_granularity=600;pitr_max_history_age=86400;magma_fragmentation_percentage=50;item_eviction_policy=value_only;persistent_metadata_purge_age=259200;max_ttl=0;ht_locks=47;compression_mode=passive;failpartialwarmup=false"
Immediately afterwards, the bucket is deleted. There's no log message for the deletion itself, but we can see it in the following traces:
[ns_server:debug,2021-08-19T10:48:53.652-07:00,n_0@127.0.0.1:ns_janitor_server<0.1625.0>:ns_janitor_server:handle_call:101]Deleted bucket "b_0" from janitor_requests
...
[ns_server:debug,2021-08-19T10:48:53.653-07:00,n_0@127.0.0.1:ns_bucket_worker<0.617.0>:ns_bucket_worker:stop_one_bucket:108]Stopping child for dead bucket: "b_0"
...
[ns_server:debug,2021-08-19T10:48:53.653-07:00,n_0@127.0.0.1:chronicle_kv_log<0.393.0>:chronicle_kv_log:log:61]update (key: bucket_names, rev: {<<"f9584bdff866f34dc3dcce65b25cdd6a">>,25659})
["b_3502","default","travel-sample"]
And 30 seconds later the request times out, even though there aren't any nodes that need to be waited on.
Attachments
Issue Links
- relates to: MB-46643 Couchbase fails to create collection index right after the creation of a collection (Reopened)