Couchbase Server: MB-49037

AWS m6g.large rebalance hung due to backfilling paused


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.1.0
    • Fix Version/s: 7.1.0
    • Component/s: couchbase-bucket
    • Build number: 7.1.0-1361

      OS: Amazon Linux 2
      ARM instance: m6g.large

      2 vCPU
      8 GB memory
      40 GB EBS

    Description

      During rebalance performance tests on ARM AWS instances, the tests consistently hang. An example job can be found here, along with the logs:

      http://perf.jenkins.couchbase.com/job/Cloud-Tester/600/

       

      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-07T223241-ns_1%40ec2-3-219-56-9.compute-1.amazonaws.com.zip
      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-07T223241-ns_1%40ec2-3-223-6-164.compute-1.amazonaws.com.zip
      https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-07T223241-ns_1%40ec2-44-195-22-82.compute-1.amazonaws.com.zip

       

      The rebalance seems to hang on 'still waiting for backfill on connection'; this message appears 115 times in the logs:

       

      [rebalance:debug,2021-10-07T22:35:41.445Z,ns_1@ec2-44-195-22-82.compute-1.amazonaws.com:<0.1108.3>:dcp_replicator:wait_for_data_move_on_one_node:192]Still waiting for backfill on connection "replication:ns_1@ec2-44-195-22-82.compute-1.amazonaws.com->ns_1@ec2-3-223-6-164.compute-1.amazonaws.com:bucket-1" bucket "bucket-1", partition 745, last estimate {0,0, <<"calculating-item-count">>}

      During this time memcached keeps returning <<"calculating-item-count">> with no estimate, and CPU usage also spikes.

       

      Attachments

        Issue Links


          Activity

            sean.corrigan Sean Corrigan created issue -
            drigby Dave Rigby made changes -
            Field Original Value New Value
            Component/s couchbase-bucket [ 10173 ]
            Component/s memcached [ 11621 ]
            drigby Dave Rigby made changes -
            Affects Version/s Neo [ 17615 ]
            drigby Dave Rigby made changes -
            Fix Version/s Neo [ 17615 ]
            drigby Dave Rigby made changes -
            Assignee Trond Norbye [ trond ] Daniel Owen [ owend ]
            drigby Dave Rigby added a comment -

            Sean Corrigan Do we know if this is specific to ARM-based AWS instances, or do you also see the same in equivalently sized x86 instances?


            sean.corrigan Sean Corrigan added a comment -

            Dave Rigby when the issue first arose, both x86 and ARM tests were hanging on the rebalance tests; however, this was investigated in MB-48825 (https://issues.couchbase.com/browse/MB-48825) and they turned out to be separate issues. The failure to calculate the item count seems isolated to the ARM instance.

            drigby Dave Rigby added a comment -

            Sean Corrigan URLs are returning an error - e.g. https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-07T223241-ns_1%40ec2-44-195-22-82.compute-1.amazonaws.com.zip

            <Error>
            <Code>InvalidObjectState</Code>
            <Message>
            The operation is not valid for the object's storage class
            </Message>
            <StorageClass>GLACIER</StorageClass>
            <RequestId>BYZM5DBAB2S2Q66A</RequestId>
            <HostId>
            16oQXHcvP+RoZPt1EkB6vCts97Spb8HhiPR/32aUEFSnVQxQ8PoGj8liZNq/vzSEvW1x2BR/XhY=
            </HostId>
            </Error>
            

            Looks like they might have been glaciered already?

            drigby Dave Rigby made changes -
            Link This issue relates to MB-48825 [ MB-48825 ]

            sean.corrigan Sean Corrigan added a comment -

            Dave Rigby My bad, will rerun the test and post new logs here.

            sean.corrigan Sean Corrigan added a comment -

            Dave Rigby Here are the new logs:
            https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-20T120409-ns_1%40ec2-3-235-136-83.compute-1.amazonaws.com.zip
            https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-20T120409-ns_1%40ec2-3-237-95-29.compute-1.amazonaws.com.zip
            https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-20T120409-ns_1%40ec2-3-238-93-68.compute-1.amazonaws.com.zip
            drigby Dave Rigby made changes -
            Assignee Daniel Owen [ owend ] Dave Rigby [ drigby ]
            drigby Dave Rigby added a comment -

            Sean Corrigan I'm not seeing the aforementioned "Still waiting for backfill on connection" message in any of those log files - are you still seeing the issue in that run?


            sean.corrigan Sean Corrigan added a comment -

            Yes, it is still hanging, interestingly also at 16.x%. Here is the corresponding job for those logs:
            http://perf.jenkins.couchbase.com/job/Cloud-Tester/615/console
            drigby Dave Rigby made changes -
            Attachment Screenshot 2021-10-20 at 13.37.03.png [ 164992 ]
            drigby Dave Rigby added a comment -

            So this is why the backfill phase of the DCP stream created during rebalance is hanging:

            2021-10-20T12:03:53.951358+00:00 INFO (bucket-1) DCP backfilling task temporarily suspended because the current memory usage is too high
            

            Essentially we have stopped any backfilling tasks as memory usage is over 95% of the specified bucket quota. From looking at the prometheus data via promtimer, we can confirm that bucket memory used (mem_used) is well above the high watermark:

            Looking into where the memory is being used...
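            As an illustration of that pause condition, here is a minimal sketch (not the actual kv_engine source; the struct, function, and constant names are made up, and the 0.95 ratio is taken from the "over 95% of the specified bucket quota" description above):

            // Hypothetical sketch: backfill is suspended while bucket memory usage
            // is above ~95% of the bucket quota, and may resume once usage drops.
            #include <cstddef>
            #include <cstdio>

            struct BucketMemStats {
                size_t memUsed; // current bucket memory usage (bytes)
                size_t maxSize; // bucket quota (ep_max_size, bytes)
            };

            // Assumed threshold, per the comment above; the real config value may differ.
            constexpr double kBackfillPauseRatio = 0.95;

            bool shouldPauseBackfill(const BucketMemStats& stats) {
                return static_cast<double>(stats.memUsed) >
                       kBackfillPauseRatio * static_cast<double>(stats.maxSize);
            }

            int main() {
                const BucketMemStats stats{1030000000, 1073741824}; // example numbers (made up)
                std::printf("pause backfill: %s\n", shouldPauseBackfill(stats) ? "yes" : "no");
                return 0;
            }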

            drigby Dave Rigby made changes -
            Attachment Screenshot 2021-10-20 at 13.41.23.png [ 164993 ]
            drigby Dave Rigby made changes -
            Attachment Screenshot 2021-10-20 at 13.45.15.png [ 164994 ]
            drigby Dave Rigby made changes -
            Attachment Screenshot 2021-10-20 at 13.41.23.png [ 164993 ]
            drigby Dave Rigby made changes -
            Attachment Screenshot 2021-10-20 at 13.37.03.png [ 164992 ]
            drigby Dave Rigby made changes -
            Attachment Screenshot 2021-10-20 at 13.45.15.png [ 164994 ]
            drigby Dave Rigby made changes -
            drigby Dave Rigby made changes -
            drigby Dave Rigby made changes -
            drigby Dave Rigby added a comment -

            Including Paolo Cocchi's KV-engine dashboard which gives a good overview:

            Note that around half of the memory is consumed by replica checkpoints:

            drigby Dave Rigby added a comment -

            Sean Corrigan Still digging into the details, but I'd be surprised if this was actually ARM-specific. Do you have any logs available for the x86 instances? I suspect we will see similar there (but maybe just not quite enough to entirely hang...)


            sean.corrigan Sean Corrigan added a comment -

            Dave Rigby Yes, here are some logs for the m5.large x86 instance:
            https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-15T114904-ns_1%40ec2-100-26-1-78.compute-1.amazonaws.com.zip
            https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-15T114904-ns_1%40ec2-34-200-246-184.compute-1.amazonaws.com.zip
            https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-15T114904-ns_1%40ec2-35-170-70-9.compute-1.amazonaws.com.zip

            I can also kick off a new run and leave it longer, as I don't think these had quite as long a run time as the m6 logs; that run had still hung to the degree that it was taking longer than the same test on lower-spec machines.
            drigby Dave Rigby added a comment -

            So we do have a number of replica checkpoints with reasonably large memory utilisation (there are many more; only the top entries are shown here):

            $ rg -N ':mem_usage:' stats.log | column -t | sort -k2 -n -r | head -n10
            vb_965:mem_usage:   4909126
            vb_910:mem_usage:   4638017
            vb_932:mem_usage:   4632771
            vb_924:mem_usage:   4564335
            vb_664:mem_usage:   4540006
            vb_1014:mem_usage:  4530871
            vb_953:mem_usage:   4506277
            vb_681:mem_usage:   4500811
            vb_579:mem_usage:   4492963
            vb_999:mem_usage:   4429508
            vb_582:mem_usage:   4391771
            vb_928:mem_usage:   4366402
            vb_912:mem_usage:   4362803
            vb_930:mem_usage:   4322427
            vb_598:mem_usage:   4299419
            vb_969:mem_usage:   4293061
            vb_966:mem_usage:   4286497
            vb_597:mem_usage:   4221753
            vb_1000:mem_usage:  4219598
            vb_589:mem_usage:   4194045
            

            Drilling into one example:

            vb_988:id_1:key_index_allocator_bytes:    0
            vb_988:id_1:queued_items_mem_usage:       3937966
            vb_988:id_1:snap_end:                     4733
            vb_988:id_1:snap_start:                   0
            vb_988:id_1:state:                        CHECKPOINT_OPEN
            vb_988:id_1:to_write_allocator_bytes:     113424
            vb_988:id_1:type:                         Disk
            vb_988:id_1:visible_snap_end:             4733
            vb_988:last_closed_checkpoint_id:         0
            vb_988:mem_usage:                         4051830
            vb_988:num_checkpoint_items:              4725
            vb_988:num_checkpoints:                   1
            vb_988:num_conn_cursors:                  1
            vb_988:num_items_for_persistence:         0
            vb_988:num_open_checkpoint_items:         4722
            vb_988:open_checkpoint_id:                1
            vb_988:persistence:cursor_checkpoint_id:  1
            vb_988:persistence:cursor_seqno:          4733
            vb_988:persistence:num_visits:            4
            vb_988:state:                             replica
            

            Note - we should be expelling (discarding from memory) items from these checkpoints as long as all cursors have processed them; and above we can see there's only one cursor (persistence) registered, so conceptually we should be able to expel here. For some reason there's still ~4MB of data in the checkpoint.
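            As a rough sketch of that expel rule (illustrative only; the function name and shape are assumptions, not the real CheckpointManager API):

            // Hypothetical sketch: only items already processed by *every* cursor are
            // safe to expel, i.e. everything at or below the lowest cursor seqno.
            #include <algorithm>
            #include <cstdint>
            #include <cstdio>
            #include <vector>

            uint64_t highestExpellableSeqno(const std::vector<uint64_t>& cursorSeqnos) {
                if (cursorSeqnos.empty()) {
                    // No registered cursors; assume nothing is expelled in this sketch.
                    return 0;
                }
                return *std::min_element(cursorSeqnos.begin(), cursorSeqnos.end());
            }

            int main() {
                // vb_988 above: the single persistence cursor is at seqno 4733, so
                // conceptually items up to 4733 - essentially the whole open checkpoint -
                // should be expellable, yet ~4MB remains resident in the checkpoint.
                std::printf("expellable up to seqno %llu\n",
                            static_cast<unsigned long long>(highestExpellableSeqno({4733})));
                return 0;
            }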

            drigby Dave Rigby made changes -
            Attachment x86 dashboard.png [ 165003 ]
            drigby Dave Rigby added a comment -

            I see the same behaviour on the x86 instances - not on all nodes, but 2 out of the 3 have their memory usage stuck above the high watermark, with replica checkpoints being the biggest issue:

            Assigning over to Paolo Cocchi given he's been working on checkpoint memory recovery recently.

            drigby Dave Rigby made changes -
            Assignee Dave Rigby [ drigby ] Paolo Cocchi [ paolo.cocchi ]
            drigby Dave Rigby made changes -
            Summary AWS ARM m6g.large Stuck Calculating Item Count AWS m6g.large rebalance hung due to backfilling paused

            sean.corrigan Sean Corrigan added a comment -

            Thanks Dave Rigby, appreciate you looking into this for us.
            owend Daniel Owen made changes -
            Rank Ranked higher
            owend Daniel Owen made changes -
            Link This issue is duplicated by MB-48825 [ MB-48825 ]
            owend Daniel Owen made changes -
            Link This issue relates to MB-48825 [ MB-48825 ]
            owend Daniel Owen added a comment -

            Hi Sean Corrigan
            I tried to download the logs e.g.

            https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-20T120409-ns_1%40ec2-3-235-136-83.compute-1.amazonaws.com.zip
            

            But I am getting the following error:

            wget https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-20T120409-ns_1%40ec2-3-235-136-83.compute-1.amazonaws.com.zip
            --2021-11-05 10:17:40--  https://s3.amazonaws.com/bugdb/jira/qe/collectinfo-2021-10-20T120409-ns_1%40ec2-3-235-136-83.compute-1.amazonaws.com.zip
            Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.232.189
            Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.232.189|:443... connected.
            HTTP request sent, awaiting response... 403 Forbidden
            2021-11-05 10:17:41 ERROR 403: Forbidden.
            

            Could you double check the links (or provide new ones)? - thanks

            owend Daniel Owen made changes -
            Epic Link MB-38441 [ 123649 ]
            owend Daniel Owen made changes -
            Rank Ranked higher
            drigby Dave Rigby added a comment -

            I have the logs locally if we can't un-glacier them...


            sean.corrigan Sean Corrigan added a comment -

            Dave Rigby I don't think I'm able to un-glacier them - we are currently having some issues in our cloud tests due to a bug in pypa setuptools. If you were able to share the logs you have downloaded locally, that would be extremely helpful.
            drigby Dave Rigby added a comment -

            Re-uploaded the 3 logs from 20th Oct:

            cbcollect_info_ns_1@ec2-3-235-136-83.compute-1.amazonaws.com_20211020-120410.zip
            cbcollect_info_ns_1@ec2-3-237-95-29.compute-1.amazonaws.com_20211020-120409.zip
            cbcollect_info_ns_1@ec2-3-238-93-68.compute-1.amazonaws.com_20211020-120410.zip

            Note these were re-compressed from my extracted cbcollect dirs, hence the slightly different filenames.

            owend Daniel Owen made changes -
            Link This issue relates to MB-49134 [ MB-49134 ]
            paolo.cocchi Paolo Cocchi made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            paolo.cocchi Paolo Cocchi made changes -
            Sprint KV 2021-Nov [ 1866 ]
            owend Daniel Owen made changes -
            Rank Ranked lower
            paolo.cocchi Paolo Cocchi added a comment - - edited

            In the latest logset uploaded by Dave, the charts for node ec2-3-235-136-83 show the same replica checkpoint memory usage as previously reported.

            Checkpoint memory state on stats.log:

            ep_checkpoint_memory:                                  475220869
             ep_checkpoint_memory_overhead:                         14284441
             ep_checkpoint_memory_ratio:                            0.5
             ep_checkpoint_memory_recovery_lower_mark:              0.6000000238418579
             ep_checkpoint_memory_recovery_upper_mark:              0.8999999761581421
             ep_checkpoint_memory_unreferenced:                     0
             ep_max_size:                                           1073741824
            

            • cm_quota: 536870912 (ep_max_size * ep_checkpoint_memory_ratio)
            • recovery_threshold: 483183820.8 (cm_quota * recovery_upper_mark)

            So here, from the checkpoint PoV, we aren't expected to recover anything.
            Also, we have a significant allocation in the HashTable that isn't being freed, which is unexpected as the global mem-usage has crossed the HWM. That is the problem here.

            The reason for replication getting stuck is that the global mem-usage has hit the Bucket Quota, which means that we are over the replication_threshold (99% of the Bucket Quota), so we have entered an unrecoverable OOM phase on DCP.
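            Spelling the arithmetic out (values taken from the stats above; illustrative code only):

            // Worked numbers from stats.log:
            //   ep_max_size = 1073741824, ep_checkpoint_memory_ratio = 0.5,
            //   ep_checkpoint_memory_recovery_upper_mark ~= 0.9,
            //   ep_checkpoint_memory = 475220869
            #include <cstdio>

            int main() {
                const double bucketQuota = 1073741824.0;
                const double cmQuota = bucketQuota * 0.5;       // 536870912
                const double recoveryThreshold = cmQuota * 0.9; // ~483183820.8
                const double checkpointMem = 475220869.0;

                // 475220869 < ~483183820.8, so checkpoint memory recovery does not
                // trigger, while overall mem_used sits above the 99%-of-quota
                // replication threshold, leaving DCP stuck in the OOM state.
                std::printf("cm_quota=%.0f recovery_threshold=%.1f\n", cmQuota, recoveryThreshold);
                std::printf("recovery triggered: %s\n",
                            checkpointMem >= recoveryThreshold ? "yes" : "no");
                std::printf("replication_threshold=%.0f\n", bucketQuota * 0.99);
                return 0;
            }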

            paolo.cocchi Paolo Cocchi made changes -
            Attachment MB-49037_dcp-backoff.png [ 168523 ]
            Attachment MB-49037_mem.png [ 168524 ]
            paolo.cocchi Paolo Cocchi made changes -
            Attachment MB-49037_ht-mem.png [ 168926 ]
            paolo.cocchi Paolo Cocchi added a comment - - edited

            The test is running with:

            ep_item_eviction_policy:                value_only
            

            But the HT doesn't appear to be fully taken up by metadata. Looking at the stats, I see that there are still items in the HashTable that would be eligible for ejection once persisted:

            stats.log confirms at vb-level:

             vb_0:ht_item_memory:                     412080
             vb_1:ht_item_memory:                     456520
             vb_2:ht_item_memory:                     484800
             vb_3:ht_item_memory:                     404000
             vb_4:ht_item_memory:                     464600
            ..
            

            Eg for vb_0:

            vb_0:num_items_for_persistence: 0
            

            Essentially we have ~ 400MB of resident/clean StoredValues (approximately equally split among active/replica vbuckets) that aren't being ejected.
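            As a back-of-the-envelope check of that ~400MB figure (assuming the default 1024 vbuckets; the per-vbucket values are the vb_0..vb_4 samples above):

            // Illustrative arithmetic only: ~400-480 KB of ht_item_memory per vbucket
            // across 1024 vbuckets gives roughly 420 MB in total.
            #include <cstdio>

            int main() {
                const double avgPerVbucket = 430000.0; // rough average of the samples above (bytes)
                const int numVbuckets = 1024;          // Couchbase default vbucket count
                std::printf("~%.0f MB\n", avgPerVbucket * numVbuckets / (1024.0 * 1024.0));
                return 0;
            }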

            Hi Sean Corrigan, some improvements in the HT stats have gone in recently (e.g. HT item counts exposed, https://review.couchbase.org/c/kv_engine/+/158100).
            That information would be very useful here; could you run this test again on the latest Neo and attach logs, please? Thanks

            paolo.cocchi Paolo Cocchi made changes -
            Attachment MB-49037_HT-ejection.png [ 169207 ]
            paolo.cocchi Paolo Cocchi added a comment - - edited

            No errors of any kind in the memcached logs, just a few instances of:

            2021-10-20T11:49:43.271054+00:00 WARNING (No Engine) Slow runtime for 'Paging out items.' on thread NonIoPool0: 28 ms
            2021-10-20T11:49:55.666988+00:00 WARNING (No Engine) Slow runtime for 'Paging out items.' on thread NonIoPool1: 36 ms
            

            Also, the Pager does keep running in the OOM window and ejects items from both active/replica up to a point. Then nothing is ejected; every ejection attempt fails:

            paolo.cocchi Paolo Cocchi made changes -
            Attachment MB-49037_HT-ejection.png [ 169207 ]
            paolo.cocchi Paolo Cocchi made changes -
            Attachment MB-49037_HT-ejection.png [ 169440 ]
            paolo.cocchi Paolo Cocchi added a comment - - edited

            Update

            The failure is observed on Neo 1361 and it's consistently reproducible.
            Live-debugging (http://perf.jenkins.couchbase.com/job/Cloud-Tester/796/console) reveals that we enter a state where the ItemPager finds only non-resident items in the HashTable, so nothing seems eligible for ejection at that point.
            That suggests that some HashTable stats might be broken or inconsistent (eg, ht_item_memory), and that actually there is just no resident item to eject in the state observed.

            Sean Corrigan has repeated the test on build 1695 and the test passes (http://perf.jenkins.couchbase.com/job/Cloud-Tester/790).
            Not clear if some collateral HT/Pager change in (1361, 1695] has fixed the issue. Or if something is just hiding the issue in recent builds.

            Details on live-debugging

            At every ItemPager run, the PagingVisitor touches all StoredValues in the HTs and hits this:

            bool HashTable::unlocked_ejectItem(const HashTable::HashBucketLock&,
                                               StoredValue*& vptr,
                                               EvictionPolicy policy) {
                if (vptr == nullptr) {
                    throw std::invalid_argument("HashTable::unlocked_ejectItem: "
                            "Unable to delete NULL StoredValue");
                }
             
                if (!vptr->eligibleForEviction(policy)) {
                    ++stats.numFailedEjects;
                    return false;         <-- !!
                }
                ..
            }
             
                bool eligibleForEviction(EvictionPolicy policy) const {
                    // Pending SyncWrite are always resident
                    if (isPending()) {
                        return false;
                    }
             
                    if (policy == EvictionPolicy::Value) {
                        return isResident() && !isDirty();         <-- !!
                    } else {
                        return !isDirty();
                    }
                }
            

            Example on vbid_891:

            (gdb) p *vptr
            $4 = {.., bySeqno = {
                value = {<std::__atomic_base<long>> = {static _S_alignment = 8, _M_i = 957}, static is_always_lock_free = true}}, lock_expiry_or_delete_or_complete_time = {
                lock_expiry = 0, delete_or_complete_time = 0}, exptime = 0, flags = 2, revSeqno = {counter = {_M_elems = "\001\000\000\000\000"}}, datatype = 3 '\003', 
              static dirtyIndex = 0, static deletedIndex = 1, static residentIndex = 2, static staleIndex = 3, bits = {static kBitsPerBlock = <optimized out>, 
                static kOne = <optimized out>, data_ = {_M_elems = {{<std::__atomic_base<unsigned char>> = {static _S_alignment = 1, _M_i = 0 '\000'}, 
                      static is_always_lock_free = true}}}}, ordered = 0 '\000', deletionSource = 0 '\000', committed = 0 '\000'}
            

            'bits = 000' indicates non-resident/non-dirty, i.e. the item has been persisted and its value already ejected from the HashTable.
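            Making that decode explicit (illustrative only; this assumes 'bits' is a plain bitset indexed by the dirtyIndex/deletedIndex/residentIndex values printed by gdb):

            #include <cstdint>
            #include <cstdio>

            int main() {
                const uint8_t bits = 0; // bits._M_i from the gdb dump above
                const int dirtyIndex = 0, deletedIndex = 1, residentIndex = 2;
                std::printf("dirty=%d deleted=%d resident=%d\n",
                            (bits >> dirtyIndex) & 1,
                            (bits >> deletedIndex) & 1,
                            (bits >> residentIndex) & 1);
                // Prints dirty=0 deleted=0 resident=0: not dirty and not resident, so
                // under value-only eviction eligibleForEviction() returns false and the
                // eject attempt takes the numFailedEjects path shown earlier.
                return 0;
            }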

            Vbucket stats are consistent with the fact that all items in the vbucket are non-resident/non-dirty:

                vb_891:eq_dcpq:replication:ns_1@ec2-34-205-37-111.compute-1.amazonaws.com->ns_1@ec2-3-92-87-232.compute-1.amazonaws.com:bucket-1:cursor_checkpoint_id: 54
                vb_891:eq_dcpq:replication:ns_1@ec2-34-205-37-111.compute-1.amazonaws.com->ns_1@ec2-3-92-87-232.compute-1.amazonaws.com:bucket-1:cursor_seqno:         3937
                vb_891:eq_dcpq:replication:ns_1@ec2-34-205-37-111.compute-1.amazonaws.com->ns_1@ec2-3-92-87-232.compute-1.amazonaws.com:bucket-1:num_items_for_cursor: 0
                vb_891:eq_dcpq:replication:ns_1@ec2-34-205-37-111.compute-1.amazonaws.com->ns_1@ec2-3-92-87-232.compute-1.amazonaws.com:bucket-1:num_visits:           0
                vb_891:id_54:key_index_allocator_bytes:                                                                                                                0
                vb_891:id_54:queued_items_mem_usage:                                                                                                                   263
                vb_891:id_54:snap_end:                                                                                                                                 3936
                vb_891:id_54:snap_start:                                                                                                                               3936
                vb_891:id_54:state:                                                                                                                                    CHECKPOINT_OPEN
                vb_891:id_54:to_write_allocator_bytes:                                                                                                                 48
                vb_891:id_54:type:                                                                                                                                     Memory
                vb_891:id_54:visible_snap_end:                                                                                                                         3936
                vb_891:last_closed_checkpoint_id:                                                                                                                      53
                vb_891:mem_usage:                                                                                                                                      751
                vb_891:num_checkpoint_items:                                                                                                                           1
                vb_891:num_checkpoints:                                                                                                                                1
                vb_891:num_conn_cursors:                                                                                                                               2
                vb_891:num_items_for_persistence:                                                                                                                      0        <-- !!
                vb_891:num_open_checkpoint_items:                                                                                                                      0
                vb_891:open_checkpoint_id:                                                                                                                             54
                vb_891:persistence:cursor_checkpoint_id:                                                                                                               54
                vb_891:persistence:cursor_seqno:                                                                                                                       3937
                vb_891:persistence:num_visits:                                                                                                                         47
                vb_891:state:                                                                                                                                          active
             
                vb_891:high_seqno:                      3936
                vb_891:ht_cache_size:                   391880
                vb_891:ht_item_memory:                  391880
                vb_891:ht_item_memory_uncompressed:     391880
                vb_891:ht_memory:                       26752
                vb_891:ht_size:                         3079
                vb_891:logical_clock_ticks:             39
                vb_891:max_cas:                         1637073329561272320
                vb_891:max_cas_str:                     2021-11-16T14:35:29.561272320
                vb_891:max_deleted_revid:               0
                vb_891:max_visible_seqno:               3936
                vb_891:might_contain_xattrs:            false
                vb_891:num_ejects:                      4112
                vb_891:num_items:                       3880
                vb_891:num_non_resident:                3880
            

            Apart from 'ht_item_memory: 391880', which suggests that we still have memory allocated for resident items.


            build-team Couchbase Build Team added a comment -

            Build couchbase-server-7.1.0-1730 contains kv_engine commit df37d73 with commit message:
            MB-49037: Add ep_ht_item_memory stat
            paolo.cocchi Paolo Cocchi added a comment - - edited

            ht_item_memory is actually computed as follows (per StoredValue):

            size_t size() const {
                return getObjectSize() + valuelen();
            }
             
            size_t StoredValue::getObjectSize() const {
                ..
                return sizeof(*this) + getKey().getObjectSize();
            }
            

            StoredValue::getObjectSize() (e.g. 63 bytes on macOS) is also what we report as ht_metadata.
            So ht_item_memory includes ht_metadata, and is therefore expected to be non-zero under Value Ejection even for a HashTable that contains only non-resident items.
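            A quick cross-check against the vb_891 numbers above (illustrative arithmetic only):

            // vb_891: ht_item_memory = 391880 bytes, num_items = 3880, all non-resident.
            // 391880 / 3880 = ~101 bytes per StoredValue, i.e. roughly the StoredValue
            // object plus the key - pure metadata, no resident values - so a non-zero
            // ht_item_memory here does not imply there are ejectable (resident) items.
            #include <cstdio>

            int main() {
                const double htItemMemory = 391880.0;
                const double numItems = 3880.0;
                std::printf("~%.0f bytes/item\n", htItemMemory / numItems); // ~101
                return 0;
            }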

            paolo.cocchi Paolo Cocchi added a comment - - edited

            Interesting point: what we call "Mem Used - HashTable" in stats accounts for SV metadata + Blob size:

            That means that in the Mem chart we are just seeing ~ 400MB of Metadata + ~ 400MB of Blobs -> ~ 800MB of allocation reported in the HashTable.

            That is misleading. Blobs are reference-counted objects. When a Blob is ejected from the HT (which is the case for all Blobs in this scenario) it shouldn't be accounted in HT mem-usage.
            Note that we still have ~ 400 MB of Blobs around as they are referenced in the replica checkpoints.
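            A generic illustration of the accounting pitfall (toy code, not kv_engine; the types and names are made up): once a reference-counted value has been ejected from the HashTable but is still referenced from a checkpoint, its bytes belong to the checkpoint accounting, not to "Mem Used - HashTable".

            #include <cstdio>
            #include <memory>
            #include <string>
            #include <vector>

            struct Blob {
                std::string value;
            };

            int main() {
                std::vector<std::shared_ptr<Blob>> checkpointRefs;
                std::shared_ptr<Blob> htRef = std::make_shared<Blob>(Blob{std::string(4096, 'x')});
                checkpointRefs.push_back(htRef); // checkpoint references the same blob

                htRef.reset(); // value ejected from the HashTable

                // Only the checkpoint reference keeps the blob alive now, so counting its
                // ~4 KB in both HashTable and checkpoint stats would double-account it.
                std::printf("blob refs after ejection: %ld\n", checkpointRefs[0].use_count());
                return 0;
            }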

            paolo.cocchi Paolo Cocchi made changes -
            Attachment MB-49037_b1695.png [ 169483 ]
            paolo.cocchi Paolo Cocchi added a comment -

            Summary

            The issue observed in build 1361 is a conjunction of:

            • The test using Value Ejection
            • All items already ejected
            • Replica checkpoints taking up to the entire CM Quota (i.e. 50% of the Bucket Quota in build 1361)
            • Replica checkpoint memory not being recovered as per default recovery thresholds (set in EP config)

            As per offline chat with Sean Corrigan, this test used to succeed at some build before 1361, so this was a regression in 1361.
            But 1361 falls in the middle of the "Improvements" window, so some relevant things have changed since then:

            1. checkpoint_max_size was buggy in 1361, fixed in 1574 - note that the checkpoint_max_size param is directly related to the effectiveness of memory recovery by Checkpoint Removal
            2. CM Quota was set to 50% (of Bucket Quota) in 1361, set to 30% recently

            Those changes produce a very different memory pattern in checkpoints in build 1695, which can be summarized as checkpoint mem recovery being much more effective and avoiding any OOM during the test:

            paolo.cocchi Paolo Cocchi added a comment -

            Hi Sean Corrigan, I'm resolving this as fixed in the more recent builds, thanks.

            paolo.cocchi Paolo Cocchi made changes -
            Triage Untriaged [ 10351 ] Triaged [ 10350 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Resolved [ 5 ]
            ritam.sharma Ritam Sharma made changes -
            Labels arm memcached arm memcached performance
            owend Daniel Owen made changes -
            Assignee Paolo Cocchi [ paolo.cocchi ] Daniel Owen [ owend ]
            Status Resolved [ 5 ] Closed [ 6 ]

            People

              Assignee: owend Daniel Owen
              Reporter: sean.corrigan Sean Corrigan
              Votes: 0
              Watchers: 5
