Fixed
Details
Assignee: Gilad Kalchheim (Deactivated)
Reporter: Dave Rigby (Deactivated)
Is this a Regression?: No
Triage: Untriaged
Operating System: Linux x86_64
Story Points: 0
Priority: Critical
Created November 24, 2022 at 9:19 PM
Updated January 20, 2025 at 3:10 PM
Resolved December 1, 2022 at 10:25 AM
Summary
As observed in some Linux environments with Transparent HugePages disabled and large amounts of RAM / bucket quota, some of the diagnostic information gathered by cbcollect_info can result in multi-second pauses of each Couchbase process while its /proc/<PID>/smaps information is gathered - up to 20s in some instances.
During this time the process is essentially stopped - requests cannot be serviced, which can cause them to time out.
This includes both requests from end-user applications and internal requests, such as those from the Query Service.
Workaround
(A) Avoid collecting logs while the cluster is not idle - the kernel issue is triggered when logs are collected.
(B) If log collection is necessary, perform it directly from the command line and add an additional --task-regex argument to exclude the problematic files:
Details
The Linux kernel exposes a number of files under /proc/<PID> that can be used to introspect the memory state of a process, including the Couchbase Server processes of interest. The cbcollect_info script used for Couchbase log collection reads the contents of some of these files as part of normal diagnostic capture. As of 7.0.4, the captured files are:
/proc/<PID>/status
/proc/<PID>/limits
/proc/<PID>/smaps
/proc/<PID>/numa_maps
These are captured for the following processes:
moxi memcached beam.smp couch_compact godu sigar_port cbq-engine indexer projector goxdcr cbft eventing-producer eventing-consumer
The files in /proc are not "real" files - they are typically generated on demand by Linux when the user attempts to read them. Some of these files (smaps and numa_maps) can take a significant amount of time for the kernel to generate for processes which have a large number of entries in their page table - i.e. a large virtual address space. While the kernel is generating the file it can block userspace processes from being scheduled - particularly the process having its memory state examined.
In the case of memcached processes with a Data Service quota of 500GB+, we have observed pauses in excess of 20 seconds:
Note how:
The normal, constantly scheduled NonIO / WriterPool tasks are disrupted for a ~20s period.
Virtually all frontend worker threads experience Slow operations and very long mutex-held periods.
There's an excellent write-up on this phenomenon at https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/10966#note_410194443 which goes into Linux kernel specifics of what the problem is.
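To get a feel for how expensive these files are to generate, the read time can be measured directly. The following is a minimal sketch, not part of cbcollect_info: the function name time_proc_reads and the proc_root parameter are hypothetical, and it simply times a full read of each of the four files listed above for one PID.

```python
import os
import time

# The four per-process files listed above as captured by cbcollect_info.
PROC_FILES = ("status", "limits", "smaps", "numa_maps")


def time_proc_reads(pid, proc_root="/proc"):
    """Return {file name: seconds taken to read it, or None if unreadable}.

    Hypothetical helper for illustration only.
    """
    timings = {}
    for name in PROC_FILES:
        path = os.path.join(proc_root, str(pid), name)
        start = time.perf_counter()
        try:
            with open(path, "rb") as f:
                # Reading forces the kernel to generate the content on demand.
                f.read()
        except OSError:
            # File missing (e.g. no NUMA support) or insufficient permissions.
            timings[name] = None
            continue
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Measure this Python process itself, which is small and safe to inspect.
    for name, secs in time_proc_reads(os.getpid()).items():
        print(name, "unreadable" if secs is None else f"{secs:.4f}s")
```

Pointing this at a large production process would trigger the same pause described in this issue, so it is only suitable for a test node or a small process.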
Transparent Hugepages
Note the issue appears to be significantly worse when Transparent Huge Pages is disabled - for example on a node with ~500GB bucket quota and THP set to "never" (as recommended for Couchbase production deployments) we observe pauses of ~20s. With THP set to "always" (out of the box default) no observable pause is seen.
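Since the severity depends on the THP setting, it can help to check which mode a node is actually running with. The sketch below assumes the standard sysfs location /sys/kernel/mm/transparent_hugepage/enabled, where the kernel marks the active mode with square brackets (for example "always madvise [never]"); some older distributions use a different path. The function names are hypothetical.

```python
def parse_thp_mode(text):
    """Return the active mode from text such as 'always madvise [never]'."""
    for token in text.split():
        # The kernel wraps the currently selected mode in square brackets.
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Read the THP mode from sysfs; the path is an assumption (see above)."""
    try:
        with open(path) as f:
            return parse_thp_mode(f.read())
    except OSError:
        # Not Linux, THP not built in, or the file lives elsewhere.
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

A result of "never" corresponds to the configuration in which the ~20s pauses were observed.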