Description
Seeing very intermittent/choppy stats for bucket2 on some nodes (109, 105, 108, 106, ..) of the large cluster.
http://10.3.2.84:8091/index.html#sec=analytics&statsBucket=%2Fpools%2Fdefault%2Fbuckets%2Fbucket2&graph=disk_write_queue&statsHostname=10.3.2.109%3A8091&zoom=zoom_minute
The get_stat script shows that every stat call completed in under 3 seconds over a period of time:
522000 calls < 3 secs , 0 calls > 3
524000 calls < 3 secs , 0 calls > 3
526000 calls < 3 secs , 0 calls > 3
528000 calls < 3 secs , 0 calls > 3
530000 calls < 3 secs , 0 calls > 3
532000 calls < 3 secs , 0 calls > 3
534000 calls < 3 secs , 0 calls > 3
536000 calls < 3 secs , 0 calls > 3
538000 calls < 3 secs , 0 calls > 3
540000 calls < 3 secs , 0 calls > 3
542000 calls < 3 secs , 0 calls > 3
544000 calls < 3 secs , 0 calls > 3
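For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of a timing loop that produces output in the same shape as above. It is not the actual get_stat script: `time_stat_calls`, `fetch_stats`, and the reporting interval of 2000 calls are assumptions made for illustration, and `fetch_stats` stands in for whatever call actually retrieves the stats from a node.

```python
import time

THRESHOLD_SECS = 3.0   # threshold taken from the output above
REPORT_EVERY = 2000    # assumed: the output above advances in steps of 2000


def time_stat_calls(fetch_stats, total_calls, threshold=THRESHOLD_SECS,
                    report_every=REPORT_EVERY, clock=time.monotonic):
    """Time each call to fetch_stats and count fast vs. slow calls.

    fetch_stats is a hypothetical zero-argument callable that performs
    one stats request. Returns (fast, slow, report_lines).
    """
    fast = 0
    slow = 0
    lines = []
    for _ in range(total_calls):
        start = clock()
        fetch_stats()
        elapsed = clock() - start
        if elapsed < threshold:
            fast += 1
        else:
            slow += 1
        # Emit a progress line every report_every calls.
        if (fast + slow) % report_every == 0:
            line = f"{fast} calls < {threshold:g} secs , {slow} calls > {threshold:g}"
            print(line)
            lines.append(line)
    return fast, slow, lines
```

Passing the clock in as a parameter keeps the loop testable without real delays; in normal use the default `time.monotonic` is enough.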
top shows two main processes, memcached and beam.smp. memcached is using 46 percent of memory and 22 percent CPU, which is similar to what the other, normally functioning nodes show.
---------------------
top - 10:17:59 up 10 days, 1:00, 1 user, load average: 4.50, 4.61, 4.79
Tasks: 131 total, 2 running, 129 sleeping, 0 stopped, 0 zombie
Cpu(s): 8.9%us, 2.8%sy, 0.0%ni, 60.6%id, 26.2%wa, 0.3%hi, 1.3%si, 0.0%st
Mem: 10230648k total, 10080788k used, 149860k free, 209548k buffers
Swap: 5210104k total, 104k used, 5210000k free, 4203332k cached
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
11047 couchbas 25  0  548m 308m 2560 S 26.6  3.1   1053:08 beam.smp
11174 couchbas 15  0 4847m 4.5g 4060 S 22.0 46.3 427:51.40 memcached
  736 root     10 -5     0    0    0 D  0.3  0.0   3:56.50 kjournald
 2906 root     15  0 46640 2896 2328 S  0.3  0.0   1:10.56 vmtoolsd
iostat looks OK.
---------------------
Linux 2.6.18-308.el5 (cen-2109) 06/25/2012
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.94    0.03    1.72   11.43    0.00   83.88
Device:      tps  MB_read/s  MB_wrtn/s  MB_read  MB_wrtn
sda        40.38       0.00       0.37     2447   319838
sda1        0.00       0.00       0.00        1        0
sda2       40.38       0.00       0.37     2446   319838
sdb        38.41       0.01       0.27     5569   235017
dm-0       94.57       0.00       0.37     2445   319839
dm-1        0.00       0.00       0.00        0        0
dm-2       70.38       0.01       0.27     5568   235017
The node has 10 percent fragmentation and shows 0% swap usage in the UI.
Bug MB-2214, reported against 1.6, shows some similar behaviour:
http://www.couchbase.com/issues/browse/MB-2214
Attached is a screenshot from node 109