Details
- Bug
- Resolution: Fixed
- Critical
- 3.0.3
- Security Level: Public
- Triaged: Yes
Description
This appears to be a continuation of the problem reported in CB-1407/MB-12670, which is listed as fixed in 3.0.2. In that bug, the customer reported an extremely long cbbackup runtime in 3.0.1 relative to the runtime when the cluster ran on 2.2. The customer also reported that in 3.0.1 all available RAM was consumed, so the OOM killer was invoked and started killing Couchbase processes, resulting in node failover.
In this case, a different customer reported the same problem after moving from 2.5.1 to 3.0.3: cbbackup now causes frequent failovers, which in turn result from the OOM killer being invoked.
While I have not been able to cause a failover or trigger the OOM killer, I have reproduced the slow cbbackup runtime and much greater memory consumption in 3.0.3 relative to 2.5.2. To reproduce, I did the following:
1) Create a 6-node cluster on 2.5.2 with the beer-sample and default buckets. Populate the default bucket with 10 million 250-byte JSON items (I used cbworkloadgen).
2) Run cbbackup against the cluster under the Linux "time" command. I received the following results:
.
[morrie@ip-10-249-139-86 tmp]$ time cbbackup http://127.0.0.1:8091 /tmp/BACKUP -u Administrator -p password
...
done
real 8m20.214s
user 3m6.024s
sys 1m19.961s
.
Additionally, top showed VIRT 482M, RES 41M, SHR 4948, CPU approx. 4.19
3) Upgrade the cluster to 3.0.3 using swap rebalance.
4) Repeat cbbackup on the same node. I received the following results:
.
time cbbackup http://127.0.0.1:8091 /tmp/BACKUP -u Administrator -p password
################### 100.0% (7303/estimated 7303 msgs)
...
done
real 23m23.213s
user 6m4.847s
sys 3m40.270s
top showed VIRT 819M, RES 7288, SHR 4948, CPU approx. 9:40
top also showed overall memory consumption increasing continuously until the end of the cbbackup run.
Note that the output of the two cbbackup runs was essentially the same size and that the message statistics for both runs were the same.
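The timing and memory observations above were taken by hand with "time" and top. A minimal sketch of capturing the same figures programmatically follows, assuming a Unix host and Python 3. The helper name measure is hypothetical and is not part of cbbackup or couchbase-cli. Note that ru_maxrss is reported in kilobytes on Linux, and that RUSAGE_CHILDREN covers every child the calling process has waited for, so the helper is best run once per fresh interpreter.

```python
import resource
import subprocess
import time


def measure(cmd):
    """Run cmd to completion and report wall time, CPU time and peak RSS.

    Hypothetical helper for illustration only. The CPU and RSS figures come
    from RUSAGE_CHILDREN, which aggregates all children already waited for.
    """
    start = time.monotonic()
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    wall = time.monotonic() - start
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "returncode": proc.returncode,
        "wall_s": wall,
        "user_s": usage.ru_utime,
        "sys_s": usage.ru_stime,
        "max_rss": usage.ru_maxrss,  # kilobytes on Linux
    }
```

Calling measure with the cbbackup argument list from the steps above, once on each version, would give directly comparable numbers for the two runs.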
Issue Links
- blocks: MB-14772 3.1.0 Minor Release (Resolved)
For Gerrit Dashboard: MB-14833

# | Subject | Branch | Project | Status | CR | V |
---|---|---|---|---|---|---|
51658,3 | MB-14833: fix large memory consumption when performing cbbackup using DCP. Problem was pump_dcp.py was using an unlimited Queue for buffering DCP messages. This fix limits queue length to 1000, by default, but can be configured with the option: '-x dcp_consumer_queue_length=<queue length>' | master | couchbase-cli | MERGED | +2 | +1 |
51791,1 | MB-14833: fix large memory consumption when performing cbbackup using DCP. Problem was pump_dcp.py was using an unlimited Queue for buffering DCP messages. This fix limits queue length to 1000, by default, but can be configured with the option: '-x dcp_consumer_queue_length=<queue length>' | 3.0.x | couchbase-cli | ABANDONED | 0 | 0 |
51792,3 | MB-14833: fix large memory consumption when performing cbbackup using DCP. | 3.0.x | couchbase-cli | MERGED | +2 | +1 |
51881,2 | MB-14833: fix large memory consumption when performing cbbackup using DCP. | 3.0.x | couchbase-cli | ABANDONED | 0 | 0 |
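
The merged change describes replacing an unlimited Queue in pump_dcp.py with one capped at 1000 entries by default. The sketch below is not the actual patch; it only illustrates the general technique with Python 3's standard queue module, where a producer blocks once the buffer is full so memory use stays bounded. The function name run_pipeline and the message tuples are hypothetical.

```python
import queue
import threading


def run_pipeline(n_messages, maxsize=1000):
    """Push n_messages through a bounded queue.

    Returns (messages received, peak observed queue depth).
    Illustrative only; this is not the code from pump_dcp.py.
    """
    # maxsize=0 would make the queue unbounded, which is the failure mode
    # described in the change subject.
    buf = queue.Queue(maxsize=maxsize)
    sentinel = None

    def producer():
        for i in range(n_messages):
            buf.put(("msg", i))  # blocks while the queue is full
        buf.put(sentinel)        # signal end of stream

    worker = threading.Thread(target=producer)
    worker.start()

    received = 0
    peak = 0
    while True:
        peak = max(peak, buf.qsize())  # never exceeds maxsize
        item = buf.get()
        if item is sentinel:
            break
        received += 1
    worker.join()
    return received, peak
```

With a cap in place, a slow consumer throttles the producer instead of letting buffered messages accumulate without limit. Per the change subject, the cap in cbbackup can be adjusted with '-x dcp_consumer_queue_length=<queue length>'.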