Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: 7.0.2
Affects Version/s: 7.0.0
Component/s: ns_server
Labels:
- request-dev-verify

Triage:
Untriaged
Story Points:
1
Is this a Regression?:
Unknown

Description

I noticed this during some testing where I was creating and deleting buckets in rapid succession. During the delete, ns_server waits until the bucket is considered to be "not active" on the KV nodes in the cluster. In this case, ns_server waited even though there were no nodes to wait on, and the bucket delete then timed out. Here's the key log trace:

[ns_server:warn,2021-08-19T10:49:23.656-07:00,n_0@127.0.0.1:<0.1628.0>:ns_orchestrator:idle:640]Nodes [] failed to delete bucket "b_0" within expected time.

You can see that the list of nodes that didn't respond in time is empty. The issue appears to be that we wait even though there are no nodes to wait on. This change appears to address the issue: http://review.couchbase.org/c/ns_server/+/159720 and may be useful to start from.
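To make the failure mode concrete, here is a minimal Python sketch of a waiter with and without a guard for the empty case. It is not ns_server's Erlang code and not the linked change. `BucketDeleteWaiter`, `node_reported_inactive`, and the `guard_empty` flag are hypothetical names, and the sketch assumes that completion is only signalled when a node reports in, which is one plausible way to end up waiting on an empty list.

```python
import threading
from typing import Iterable, List, Set


class BucketDeleteWaiter:
    """Tracks which nodes still have to report the bucket as not active.

    Hypothetical illustration only; the real logic lives in ns_server.
    """

    def __init__(self, nodes: Iterable[str]) -> None:
        self._pending: Set[str] = set(nodes)
        self._lock = threading.Lock()
        self._done = threading.Event()

    def node_reported_inactive(self, node: str) -> None:
        # Assumption: completion is only signalled from here, i.e. when
        # some node reports. With no nodes, nobody ever sets the event.
        with self._lock:
            self._pending.discard(node)
            if not self._pending:
                self._done.set()

    def wait(self, timeout_s: float, guard_empty: bool = True) -> List[str]:
        """Return the nodes that failed to report within timeout_s."""
        with self._lock:
            # The guard: nothing to wait on, so do not wait at all.
            if guard_empty and not self._pending:
                return []
        # Without the guard, an empty node list blocks for the full timeout.
        self._done.wait(timeout_s)
        with self._lock:
            return sorted(self._pending)
```

With `guard_empty=False` and an empty node list, `wait` blocks for the whole timeout and then returns an empty list of failed nodes, which mirrors the `Nodes [] failed to delete bucket` warning above. With the guard enabled it returns immediately.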

For completeness, here's where the bucket was created:

[ns_server:info,2021-08-19T10:48:53.632-07:00,n_0@127.0.0.1:ns_memcached-b_0<0.4374.0>:ns_memcached:do_ensure_bucket:1303]Created bucket "b_0" with config string "max_size=104857600;dbname=/Users/dfinlay/work8/ns_server/data/n_0/data/b_0;backend=couchdb;couch_bucket=b_0;max_vbuckets=64;alog_path=/Users/dfinlay/work8/ns_server/data/n_0/data/b_0/access.log;data_traffic_enabled=false;max_num_workers=3;uuid=910c2397af610d21ac57e5a6ce842154;conflict_resolution_type=seqno;bucket_type=persistent;durability_min_level=none;pitr_enabled=false;pitr_granularity=600;pitr_max_history_age=86400;magma_fragmentation_percentage=50;item_eviction_policy=value_only;persistent_metadata_purge_age=259200;max_ttl=0;ht_locks=47;compression_mode=passive;failpartialwarmup=false"

The bucket is deleted immediately afterwards. There's no log message for the delete itself, but we can see it in the following traces:

[ns_server:debug,2021-08-19T10:48:53.652-07:00,n_0@127.0.0.1:ns_janitor_server<0.1625.0>:ns_janitor_server:handle_call:101]Deleted bucket "b_0" from janitor_requests

...

[ns_server:debug,2021-08-19T10:48:53.653-07:00,n_0@127.0.0.1:ns_bucket_worker<0.617.0>:ns_bucket_worker:stop_one_bucket:108]Stopping child for dead bucket: "b_0"

...

[ns_server:debug,2021-08-19T10:48:53.653-07:00,n_0@127.0.0.1:chronicle_kv_log<0.393.0>:chronicle_kv_log:log:61]update (key: bucket_names, rev: {<<"f9584bdff866f34dc3dcce65b25cdd6a">>,25659})

["b_3502","default","travel-sample"]

Thirty seconds later, at 10:49:23.656, the delete request times out even though there aren't any nodes that need to be waited on.
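The gap between the delete and the timeout can be checked directly from the timestamps in the traces above. The following Python sketch does that. The `LOG_PREFIX` pattern and the `log_timestamp` helper are assumptions inferred from the few lines quoted in this report, not an official description of the ns_server log format.

```python
import re
from datetime import datetime

# Matches the "[component:level,timestamp," prefix seen in the traces above.
# The leading bracket is optional because not every quoted line has one.
LOG_PREFIX = re.compile(r"^\[?(?P<component>\w+):(?P<level>\w+),(?P<ts>[^,]+),")


def log_timestamp(line: str) -> datetime:
    """Extract the timezone-aware timestamp from one log line."""
    match = LOG_PREFIX.match(line)
    if match is None:
        raise ValueError(f"unrecognised log line: {line[:40]!r}")
    return datetime.fromisoformat(match.group("ts"))


deleted = log_timestamp(
    '[ns_server:debug,2021-08-19T10:48:53.652-07:00,n_0@127.0.0.1:'
    'ns_janitor_server<0.1625.0>:ns_janitor_server:handle_call:101]'
    'Deleted bucket "b_0" from janitor_requests'
)
timed_out = log_timestamp(
    '[ns_server:warn,2021-08-19T10:49:23.656-07:00,n_0@127.0.0.1:'
    '<0.1628.0>:ns_orchestrator:idle:640]'
    'Nodes [] failed to delete bucket "b_0" within expected time.'
)

gap_s = (timed_out - deleted).total_seconds()
print(f"{gap_s:.3f} s between delete and timeout")  # prints "30.004 s between delete and timeout"
```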

Attachments

- n_0.zip (10.15 MB, 19/Aug/21 11:15 AM)
- n_1.zip (6.58 MB, 19/Aug/21 11:15 AM)
- n_2.zip (9.71 MB, 19/Aug/21 11:15 AM)

Issue Links

relates to

MB-46643 Couchbase fails to create collection index right after the creation of a collection

Reopened

People

Assignee: Artem Stemkovski
Reporter: Dave Finlay
Votes: 0
Watchers: 5

Dates

Created: 19/Aug/21 11:14 AM
Updated: 23/Aug/21 8:39 AM
Resolved: 19/Aug/21 5:49 PM

Gerrit Reviews

There are no open Gerrit changes.

There are 3 closed Gerrit changes:

- MB-48059: Don't wait if there are no nodes to wait on
- MB-48059 don't timeout on bucket delete if the bucket is not active
- Merge remote-tracking branch 'gerrit/cheshire-cat'

Delete bucket times out waiting for no nodes
