Details

Type: Bug
Resolution: Not a Bug
Priority: Major
Fix Version/s: 6.6.1
Affects Version/s: 6.6.1
Component/s: couchbase-bucket
Labels:
None
Environment:
6.6.1-9182 and 6.6.1-9183

Triage:
Triaged
Operating System:
Centos 64-bit
Story Points:
1
Is this a Regression?:
No

Description

Script to Repro

./testrunner -i /tmp/win10-bucket-ops.ini rerun=False -t volumetests.test_system_orchestrator_heartbeats_and_timeouts.volume.test_volume_MB_41562,nodes_init=7,initial_load=3000000,replicas=2

This is a new system test written to test MB-41562.

Steps to Repro
1. Create a 7 node cluster as shown below
----------------------------------------------
Nodes            Services     Status
----------------------------------------------
172.23.105.175   kv           Cluster node
172.23.106.233   None         <--- IN ---
172.23.106.236   ['kv']       <--- IN ---
172.23.106.238   ['kv']       <--- IN ---
172.23.106.250   ['kv']       <--- IN ---
172.23.106.251   ['index']    <--- IN ---
172.23.121.74    ['n1ql']     <--- IN ---
----------------------------------------------

2. Set non-default orchestrator heartbeats and timeouts.

2020-11-01 22:12:51,023 | test  | INFO    | MainThread | [test_system_orchestrator_heartbeats_and_timeouts:test_volume_MB_41562:639] Step 1: Set Non default orchestrator heartbeats and timeouts

curl http://localhost:9000/diag/eval -u Administrator:asdasd -d 'ns_config:set({mb_master, heartbeat_interval}, 500).'

curl http://localhost:9000/diag/eval -u Administrator:asdasd -d 'ns_config:set({mb_master, timeout_interval_count}, 3).'

curl http://localhost:9000/diag/eval -u Administrator:asdasd -d 'ns_config:set({leader_lease_acquire_worker, lease_time}, 5000).'

curl http://localhost:9000/diag/eval -u Administrator:asdasd -d 'ns_config:set({leader_lease_acquire_worker, lease_grace_time}, 2000).'

curl http://localhost:9000/diag/eval -u Administrator:asdasd -d 'ns_config:set({leader_lease_acquire_worker, lease_renew_after}, 500).'
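The five commands above differ only in the config key and value, so they can be generated from a small table. The following is a minimal sketch, not part of the test itself; the helper names `erlang_expr` and `curl_command` are hypothetical, and the host, port, and credentials are simply the ones shown in the commands above.

```python
# Settings from step 2: (config namespace, parameter, value).
SETTINGS = [
    ("mb_master", "heartbeat_interval", 500),
    ("mb_master", "timeout_interval_count", 3),
    ("leader_lease_acquire_worker", "lease_time", 5000),
    ("leader_lease_acquire_worker", "lease_grace_time", 2000),
    ("leader_lease_acquire_worker", "lease_renew_after", 500),
]


def erlang_expr(namespace, param, value):
    """Build the ns_config:set(...) expression sent to /diag/eval (hypothetical helper)."""
    return f"ns_config:set({{{namespace}, {param}}}, {value})."


def curl_command(expr, host="localhost", port=9000,
                 user="Administrator", password="asdasd"):
    """Render the curl invocation in the same form as the commands in step 2."""
    return f"curl http://{host}:{port}/diag/eval -u {user}:{password} -d '{expr}'"


if __name__ == "__main__":
    for namespace, param, value in SETTINGS:
        print(curl_command(erlang_expr(namespace, param, value)))
```

Running the script prints one curl command per setting, which makes it easy to review or change the values in one place before applying them.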

3. Do initial Data load and start running n1ql queries in the background.
2020-11-13 18:49:55,753 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
---------------------------------------------------------------------------------------------
Bucket   Type     Replicas  Durability  TTL  Items    RAM Quota    RAM Used    Disk Used
---------------------------------------------------------------------------------------------
bucket1  membase  2         none             3000000  24950865920  1825757816  1774463492
bucket2  membase  2         none             3000000  24950865920  1794652760  1595313108
bucket3  membase  2         none             3000000  24950865920  1798160104  1669257546
bucket4  membase  2         none             3000000  24950865920  1802871744  1839075769
---------------------------------------------------------------------------------------------
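To put the bucket statistics above in proportion, the RAM used can be expressed as a fraction of the RAM quota. This is a rough sketch that assumes the sixth and seventh values in each row are the RAM quota and RAM used in bytes; that column mapping is inferred from the header order, and `ram_utilisation` is a hypothetical helper.

```python
# (RAM quota, RAM used) in bytes, copied from the bucket statistics rows.
# Assumption: the column mapping follows the header order in the table.
BUCKETS = {
    "bucket1": (24950865920, 1825757816),
    "bucket2": (24950865920, 1794652760),
    "bucket3": (24950865920, 1798160104),
    "bucket4": (24950865920, 1802871744),
}


def ram_utilisation(quota_bytes, used_bytes):
    """Return RAM used as a fraction of the bucket RAM quota."""
    return used_bytes / quota_bytes


if __name__ == "__main__":
    for name, (quota, used) in BUCKETS.items():
        print(f"{name}: {ram_utilisation(quota, used):.1%} of quota")
```

Under that assumption, every bucket is using well under a tenth of its quota at the point this snapshot was taken.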

4. Do a rebalance in
2020-11-13 18:49:58,676 | test | INFO | pool-1-thread-3 | [table_view:display:72] Rebalance Overview
----------------------------------------------
Nodes            Services     Status
----------------------------------------------
172.23.105.175   kv           Cluster node
172.23.106.250   kv           Cluster node
172.23.106.236   kv           Cluster node
172.23.106.251   index        Cluster node
172.23.106.233   kv           Cluster node
172.23.106.238   kv           Cluster node
172.23.121.74    n1ql         Cluster node
172.23.121.78    None         <--- IN ---
----------------------------------------------

5. Find the orchestrator node, kill babysitter on the orchestrator, do a hard failover, start couchbase-server, start delta recovery and rebalance. This step is repeated 5 times in a loop.
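The ordering of the actions in step 5 can be sketched as a small driver loop. This is only an illustration of the sequence described above, not the testrunner code: every callable passed in (`find_orchestrator`, `kill_babysitter`, `hard_failover`, `start_server`, `delta_recover`, `rebalance`) is a hypothetical placeholder for whatever the test framework actually does.

```python
def run_orchestrator_failover_cycles(find_orchestrator, kill_babysitter,
                                     hard_failover, start_server,
                                     delta_recover, rebalance, cycles=5):
    """Repeat the step 5 sequence `cycles` times.

    All arguments are caller-supplied callables (hypothetical names);
    the function only fixes the order in which they are invoked.
    Returns the list of nodes that were orchestrator in each cycle.
    """
    visited = []
    for _ in range(cycles):
        node = find_orchestrator()   # the orchestrator may change every cycle
        kill_babysitter(node)        # take the orchestrator down
        hard_failover(node)          # fail the node over
        start_server(node)           # bring couchbase-server back up
        delta_recover(node)          # mark the node for delta recovery
        rebalance()                  # rebalance the node back in
        visited.append(node)
    return visited
```

Because the orchestrator is looked up at the start of every iteration, the loop naturally follows the role if it moves to a different node after a failover.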

6. Do a rebalance out
2020-11-13 21:29:57,996 | test | INFO | pool-1-thread-10 | [table_view:display:72] Rebalance Overview
----------------------------------------------
Nodes            Services     Status
----------------------------------------------
172.23.105.175   kv           Cluster node
172.23.121.78    [u'kv']      --- OUT --->
172.23.106.250   kv           Cluster node
172.23.106.236   kv           Cluster node
172.23.106.251   index        Cluster node
172.23.106.233   kv           Cluster node
172.23.106.238   kv           Cluster node
172.23.121.74    n1ql         Cluster node
----------------------------------------------

7. Flush all the buckets.

Repeat steps 3-7 multiple times.

I ran cbcollect to share with the ns_server team and saw the following line in the nutshell.

See CBSE-4320 for details about checkpoint usage.*

I wanted to check whether there is anything of concern here.

Attachments

175-mem.png
641 kB
17/Nov/20 1:05 AM
175-rr.png
437 kB
17/Nov/20 1:05 AM

People

Assignee: Balakumaran Gopal

Reporter: Balakumaran Gopal

Votes: 0

Watchers: 3

Dates

Created: 15/Nov/20 6:48 PM

Updated: 17/Nov/20 10:27 PM

Resolved: 17/Nov/20 1:09 AM

Slow KV requests to some nodes