Details
-
Bug
-
Resolution: Fixed
-
Critical
-
4.1.0, 4.1.1
-
2x CB Server nodes (each with Intel(R) Xeon(R) CPU E7-4820 v2 @ 2.00GHz, 64 GB memory, >500GB disks with H/W RAID, GbE network)
Couchbase 4.1.0 is deployed on the two Server nodes.
2x CB Client nodes (each with Intel(R) Xeon(R) CPU E7-4820 v2 @ 2.00GHz, 64 GB memory, >500GB disks with H/W RAID, GbE network)
The latest stable YCSB is deployed on the two Client nodes.
-
Triaged
-
Centos 64-bit
-
Unknown
Description
While testing CB 4.1.0 with the latest stable YCSB, I found that in a two-node cluster one node shows normal CPU utilization while the other shows much higher CPU utilization.
The test procedure and screenshots are below.
1. Run the following YCSB load commands from both Client nodes, with the record count set to 20000000.
On Client Node 1:
./bin/ycsb load couchbase -s -P ./workloads/workloada -p couchbase.url=http://22.188.5.105:8091/pools -threads 192 -p couchbase.bucket=testdb1
On Client Node 2:
./bin/ycsb load couchbase -s -P ./workloads/workloada -p couchbase.url=http://22.188.5.105:8091/pools -threads 192 -p couchbase.bucket=testdb1 -p insertstart=20000001
2. After YCSB has been launched on both clients and has stabilized, the load on each CB Server node is as shown below.
3. Below is the total load of the two-node cluster.
4. Below is the load status of each CB Server node.
5. Below is the TOP status of each CB Server node.
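The way the key space is divided between the two loaders in step 1 can be sketched as follows. This is only an illustration, not part of the test that was run: it assumes YCSB numbers keys from 0, that each client is given its own `insertstart` and `insertcount` properties, and that the combined load is 40,000,000 records (two clients at 20,000,000 each). The helper name `client_ranges` is hypothetical.

```python
def client_ranges(total_records, num_clients):
    """Split [0, total_records) into contiguous, non-overlapping (start, count) ranges."""
    base, extra = divmod(total_records, num_clients)
    ranges = []
    start = 0
    for i in range(num_clients):
        # The first `extra` clients take one additional record each.
        count = base + (1 if i < extra else 0)
        ranges.append((start, count))
        start += count
    return ranges


if __name__ == "__main__":
    # Assumed total: two clients loading 20,000,000 records each.
    for start, count in client_ranges(40_000_000, 2):
        print(f"-p insertstart={start} -p insertcount={count}")
```

Each printed pair could be appended to one client's `ycsb load` command so that the ranges neither overlap nor leave a gap.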
The two CB servers have identical hardware and CentOS 6.4 configuration (cores, NUMA off, THP off, disk partition layout, SELinux disabled, iptables off, ...). We also tested the network latency between the CB Server nodes with the qperf utility, and the network performs well.
As you can see, the CPU utilization of node *.105 is much higher than that of *.104 in the test.
Is this a bug or a normal characteristic of CB? Which actions, activities, or jobs significantly affect the CPU of Server node *.105, and is there any other telemetry that can be used to monitor the affected node when this happens?
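One possible way to watch for this kind of imbalance is to poll the cluster's REST endpoint and compare per-node CPU. The sketch below is a rough illustration under assumptions: it assumes `/pools/default` on port 8091 returns a `nodes` list whose entries carry `hostname` and `systemStats.cpu_utilization_rate`, and the function names, the credentials passed to `fetch_pool`, and the 20-point threshold are all hypothetical.

```python
import base64
import json
import urllib.request


def fetch_pool(host, user, password):
    """Fetch cluster details over HTTP basic auth (endpoint and fields are assumed)."""
    req = urllib.request.Request(f"http://{host}:8091/pools/default")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def cpu_by_node(pool):
    """Map each node's hostname to its reported CPU utilization (percent)."""
    return {
        node["hostname"]: node["systemStats"]["cpu_utilization_rate"]
        for node in pool.get("nodes", [])
    }


def imbalanced(cpu, threshold=20.0):
    """Return nodes whose CPU exceeds the cluster minimum by more than `threshold` points."""
    if not cpu:
        return []
    low = min(cpu.values())
    return sorted(host for host, value in cpu.items() if value - low > threshold)
```

Calling `imbalanced(cpu_by_node(fetch_pool(host, user, password)))` at intervals during the load would list any node running noticeably hotter than the least-loaded one.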
Thanks,