cbcollect_info reading /proc/<PID>/smaps causes 20+ second pauses in CB processes

Description

Summary

In some Linux environments with Transparent Huge Pages disabled and a large amount of RAM / bucket quota, some of the diagnostic information gathered by cbcollect_info can cause multi-second pauses (up to 20s in some instances) in each Couchbase process while its /proc/<PID>/smaps information is gathered.

During this time the process is essentially stopped - requests cannot be serviced, resulting in them potentially timing out.

This includes both requests from end-user applications and internal requests, such as those from the Query Service.

Workaround

(A) Avoid collecting logs while the cluster is busy - the kernel issue is triggered when logs are collected.

(B) If log collection is necessary, perform it directly from the command-line and add an additional --task-regex argument to exclude the problematic files:

Details

The Linux kernel exposes a number of files under /proc/<PID> to introspect the memory state of a process, including the Couchbase Server processes of interest. The cbcollect_info script used for Couchbase log collection reads the contents of some of these files as part of normal diagnostic capture. As of 7.0.4, the captured files are:

  1. /proc/<PID>/status

  2. /proc/<PID>/limits

  3. /proc/<PID>/smaps

  4. /proc/<PID>/numa_maps

These are captured for the following processes: moxi memcached beam.smp couch_compact godu sigar_port cbq-engine indexer projector goxdcr cbft eventing-producer eventing-consumer
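
As a rough illustration of the capture set described above, the following Python sketch builds the list of /proc paths for a set of PIDs, with an option to skip the two files that are expensive to generate. The names PROC_FILES, SLOW_PROC_FILES and proc_paths are hypothetical and are not taken from cbcollect_info itself.

```python
# Hypothetical names; this is not the cbcollect_info implementation.
PROC_FILES = ["status", "limits", "smaps", "numa_maps"]
# Files that can be expensive for the kernel to generate.
SLOW_PROC_FILES = {"smaps", "numa_maps"}


def proc_paths(pids, skip_slow=True):
    """Return the /proc/<PID>/<file> paths to read for each PID."""
    names = [n for n in PROC_FILES if not (skip_slow and n in SLOW_PROC_FILES)]
    return [f"/proc/{pid}/{name}" for pid in pids for name in names]


if __name__ == "__main__":
    # 1234 is a placeholder PID used only for illustration.
    for path in proc_paths([1234]):
        print(path)
```

With skip_slow=True only the status and limits paths are produced; passing skip_slow=False restores all four files.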

The files in /proc are not "real" files - they are typically generated on-demand by Linux when the user attempts to read them. Some of these files (smaps and numa_maps) can take a significant amount of time for the kernel to generate for processes which have a large number of entries in their page table - i.e. a large virtual address space. While the kernel is generating the file it can block userspace processes from being scheduled - particularly the process having its memory state examined.
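
Because the content is generated at read time, the cost shows up as the wall-clock time of the read itself. A minimal Python sketch for measuring that is shown below; timed_read is a hypothetical helper, not part of cbcollect_info. Note that pointing it at the smaps file of a large process on an affected system would itself trigger the pause described here, so it defaults to the cheap /proc/self/status file.

```python
import sys
import time


def timed_read(path):
    """Read a file fully and return (bytes_read, elapsed_seconds)."""
    start = time.monotonic()
    with open(path, "rb") as f:
        data = f.read()
    return len(data), time.monotonic() - start


if __name__ == "__main__":
    # Default to this process's own status file, which is cheap to generate.
    path = sys.argv[1] if len(sys.argv) > 1 else "/proc/self/status"
    try:
        size, elapsed = timed_read(path)
        print(f"{path}: {size} bytes in {elapsed:.3f}s")
    except OSError as exc:
        print(f"could not read {path}: {exc}")
```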

In the case of memcached processes with a Data Service quota of 500GB+, we have observed pauses in excess of 20 seconds:

Note how:

  • The NonIO / WriterPool tasks, which are normally scheduled and run constantly, are disrupted for a ~20s period.

  • Virtually all frontend worker threads experience slow operations and very long mutex-held periods.

There's an excellent write-up on this phenomenon at https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/10966#note_410194443 which goes into Linux kernel specifics of what the problem is.

Transparent Hugepages

Note that the issue appears to be significantly worse when Transparent Huge Pages (THP) is disabled. For example, on a node with ~500GB bucket quota and THP set to "never" (as recommended for Couchbase production deployments) we observe pauses of ~20s. With THP set to "always" (the out-of-the-box default), no pause is observed.
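
To check which THP mode a node is running with, the kernel reports the active setting in brackets in /sys/kernel/mm/transparent_hugepage/enabled (for example "always madvise [never]"). A small Python sketch for extracting it follows; active_thp_mode is a hypothetical helper name, and the sysfs path is assumed to be the standard one for the distribution in use.

```python
def active_thp_mode(text):
    """Return the bracketed (active) mode from the THP 'enabled' file content."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    # Assumed standard sysfs location; some distributions may differ.
    path = "/sys/kernel/mm/transparent_hugepage/enabled"
    try:
        with open(path) as f:
            print(active_thp_mode(f.read()))
    except OSError as exc:
        print(f"could not read {path}: {exc}")
```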



Activity

CB robot January 12, 2023 at 8:55 PM

Build couchbase-server-7.1.4-3555 contains ns_server commit da7902b with commit message:
: cbcollect_info: Don't read /proc/PID/{smaps,numa_maps}

CB robot January 12, 2023 at 8:55 PM

Build couchbase-server-7.1.4-3555 contains ns_server commit 2f7f5c1 with commit message:
: cbcollect_info: Don't read /proc/PID/{smaps,numa_maps}

Gilad Kalchheim December 5, 2022 at 12:02 PM

The change is quite isolated and small. Dave has already tested it functionally. I verified that it's in the installation of build 7.0.5-7658 by inspecting it. Given the constraints we have, I think that should suffice. 

CB robot December 3, 2022 at 10:11 AM

Build couchbase-server-8.0.0-1182 contains ns_server commit da7902b with commit message:
: cbcollect_info: Don't read /proc/PID/{smaps,numa_maps}

CB robot December 3, 2022 at 10:11 AM

Build couchbase-server-8.0.0-1182 contains ns_server commit 2f7f5c1 with commit message:
: cbcollect_info: Don't read /proc/PID/{smaps,numa_maps}

Fixed

Details

Assignee

Reporter

Is this a Regression?

No

Triage

Untriaged

Operating System

Linux x86_64

Story Points

Priority


Created November 24, 2022 at 9:19 PM
Updated January 20, 2025 at 3:10 PM
Resolved December 1, 2022 at 10:25 AM