  1. Couchbase Server
  2. MB-5672

[longevity] Item counts/disk write queue dip intermittently on large cluster bucket2 [only activity on the cluster: a very small load]

Details

    • Bug
    • Resolution: Won't Fix
    • Major
    • 2.0
    • 1.8.1
    • UI
    • Security Level: Public
    • None
    • 17-node cluster, CentOS
      3 buckets: bucket1, bucket2, default
      * Cluster has been running for a week with a small load against it (< 2k ops/sec).
      * Cluster is in DGM.

    Description

      Seeing very intermittent/choppy stats on some nodes [109, 105, 108, 106, ..] for bucket2 on the large cluster.
      http://10.3.2.84:8091/index.html#sec=analytics&statsBucket=%2Fpools%2Fdefault%2Fbuckets%2Fbucket2&graph=disk_write_queue&statsHostname=10.3.2.109%3A8091&zoom=zoom_minute

      The get_stat script shows that every stat call completed in under 3 seconds over the period observed.
      522000 calls < 3 secs , 0 calls > 3
      524000 calls < 3 secs , 0 calls > 3
      526000 calls < 3 secs , 0 calls > 3
      528000 calls < 3 secs , 0 calls > 3
      530000 calls < 3 secs , 0 calls > 3
      532000 calls < 3 secs , 0 calls > 3
      534000 calls < 3 secs , 0 calls > 3
      536000 calls < 3 secs , 0 calls > 3
      538000 calls < 3 secs , 0 calls > 3
      540000 calls < 3 secs , 0 calls > 3
      542000 calls < 3 secs , 0 calls > 3
      544000 calls < 3 secs , 0 calls > 3
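
      The get_stat script itself is not attached, but output of this shape is consistent with a loop that times each stats call and counts it against a 3-second threshold. A minimal sketch of that idea, assuming a hypothetical `fetch_stats` callable standing in for whatever request the real script makes (the function name, the reporting interval, and the parameters are illustrative, not taken from the script):

```python
import time


def bucket_call_times(fetch_stats, total_calls, threshold_secs=3.0,
                      report_every=2000, clock=time.monotonic):
    """Time each call to fetch_stats and count fast vs. slow calls.

    fetch_stats is a hypothetical stand-in for the real stats request.
    Returns (fast, slow, report_lines).
    """
    fast = slow = 0
    report_lines = []
    for i in range(1, total_calls + 1):
        start = clock()
        fetch_stats()
        elapsed = clock() - start
        if elapsed < threshold_secs:
            fast += 1
        else:
            slow += 1
        # Emit a running summary line in the same shape as the report above.
        if i % report_every == 0:
            report_lines.append(
                f"{fast} calls < {threshold_secs:g} secs , "
                f"{slow} calls > {threshold_secs:g}"
            )
    return fast, slow, report_lines
```

      A run in which every call returns quickly yields only "0 calls > 3" lines, which is what the output above shows; it says the stats calls are not timing out, not that the values they return are steady.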

      top shows two main processes, memcached and beam.smp. memcached is using 46 percent of memory and 22 percent CPU, which is similar to the other, normally functioning nodes.
      ---------------------
      top - 10:17:59 up 10 days, 1:00, 1 user, load average: 4.50, 4.61, 4.79
      Tasks: 131 total, 2 running, 129 sleeping, 0 stopped, 0 zombie
      Cpu(s): 8.9%us, 2.8%sy, 0.0%ni, 60.6%id, 26.2%wa, 0.3%hi, 1.3%si, 0.0%st
      Mem: 10230648k total, 10080788k used, 149860k free, 209548k buffers
      Swap: 5210104k total, 104k used, 5210000k free, 4203332k cached

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      11047 couchbas 25 0 548m 308m 2560 S 26.6 3.1 1053:08 beam.smp
      11174 couchbas 15 0 4847m 4.5g 4060 S 22.0 46.3 427:51.40 memcached
      736 root 10 -5 0 0 0 D 0.3 0.0 3:56.50 kjournald
      2906 root 15 0 46640 2896 2328 S 0.3 0.0 1:10.56 vmtoolsd
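
      For comparing this node against the others, the I/O-wait and memory figures can be pulled out of the top summary mechanically. A rough sketch using only the three summary lines quoted above; the helper name and the returned keys are illustrative, and it assumes the older top layout shown here, where the "cached" figure appears on the Swap line:

```python
import re

# Summary lines copied from the top output above.
TOP_SUMMARY = """\
Cpu(s): 8.9%us, 2.8%sy, 0.0%ni, 60.6%id, 26.2%wa, 0.3%hi, 1.3%si, 0.0%st
Mem: 10230648k total, 10080788k used, 149860k free, 209548k buffers
Swap: 5210104k total, 104k used, 5210000k free, 4203332k cached
"""


def parse_top_summary(text):
    """Extract I/O wait and memory used excluding buffers and page cache."""
    # CPU fields look like "26.2%wa"; map the suffix to its percentage.
    cpu = {name: float(val) for val, name in re.findall(r"([\d.]+)%(\w+)", text)}
    # Buffers and page cache are reported in kB on the Mem and Swap lines.
    kb = {name: int(val)
          for val, name in re.findall(r"(\d+)k (buffers|cached)", text)}
    mem = re.search(r"Mem:\s+(\d+)k total,\s+(\d+)k used", text)
    mem_total, mem_used = int(mem.group(1)), int(mem.group(2))
    app_used = mem_used - kb["buffers"] - kb["cached"]
    return {
        "iowait_pct": cpu["wa"],
        "app_used_kb": app_used,
        "app_used_pct": 100.0 * app_used / mem_total,
    }


summary = parse_top_summary(TOP_SUMMARY)
print(summary)
```

      Running the same helper on the summary lines from a healthy node would give a like-for-like comparison of I/O wait and of memory held by processes rather than by the page cache.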

      iostat looks OK.
      ---------------------
      Linux 2.6.18-308.el5 (cen-2109) 06/25/2012

      avg-cpu: %user %nice %system %iowait %steal %idle
      2.94 0.03 1.72 11.43 0.00 83.88

      Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
      sda 40.38 0.00 0.37 2447 319838
      sda1 0.00 0.00 0.00 1 0
      sda2 40.38 0.00 0.37 2446 319838
      sdb 38.41 0.01 0.27 5569 235017
      dm-0 94.57 0.00 0.37 2445 319839
      dm-1 0.00 0.00 0.00 0 0
      dm-2 70.38 0.01 0.27 5568 235017

      The node has 10 percent fragmentation and shows 0% swap usage (per the UI).

      Bug MB-2214, filed against 1.6, shows similar behaviour.
      http://www.couchbase.com/issues/browse/MB-2214

      Attached is a screenshot from node 109.

          People

            alkondratenko Aleksey Kondratenko (Inactive)
            ketaki Ketaki Gangal (Inactive)