Couchbase Server / MB-8380

[system test] On a full disk node where persisting fails, disk queue drain rate shown in UI is high (20k) which is not correct

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • 3.0
    • 2.1.0
    • couchbase-bucket
    • Security Level: Public
    • None
    • 2.0.2-807-rel
    • Centos 64-bit

    Description

      Cluster IP is 172.23.105.23
      1. Create an 8-node cluster; each node has 12G RAM and an HDD.
      2. Create 2 buckets, default and saslbucket, with memory quotas of 6G and 4G.
      3. Run the KV load continuously.
      4. Log in to one of the nodes in the cluster. Create a very large (360G) file under /data, which is the data path for that node, to consume all the disk space.

      When disk usage reaches 100%, ns_server pops up a disk-full warning and the message "Write Commit Failure. Disk write failed for item in Bucket "default" on node 172.23.105.27." This is expected.
      But in the UI stats, the Disk Queues drain rate exceeds 20K, which cannot be real, since the number of items in the Disk Write queue keeps increasing.

      root@cola-s10309:~# /opt/couchbase/bin/cbstats localhost:11210 all | egrep disk
      ep_diskqueue_drain: 111121577
      ep_diskqueue_fill: 56708900
      ep_diskqueue_items: 311206
      ep_diskqueue_memory: 58496
      ep_diskqueue_pending: 79039
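
The counters above can be sanity-checked with a short script. This is a minimal sketch, not part of Couchbase: the helper names `parse_cbstats` and `drain_exceeds_fill` are hypothetical, and the check assumes that a drain counter larger than the fill counter points to items being counted as drained more than once.

```python
def parse_cbstats(text):
    """Parse 'name: value' lines, as printed by cbstats, into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value.strip())
    return stats


# Values copied from the cbstats output in this report.
SAMPLE = """\
ep_diskqueue_drain: 111121577
ep_diskqueue_fill: 56708900
ep_diskqueue_items: 311206
ep_diskqueue_memory: 58496
ep_diskqueue_pending: 79039
"""


def drain_exceeds_fill(stats):
    """Assumed heuristic: more drains than fills suggests double counting."""
    return stats["ep_diskqueue_drain"] > stats["ep_diskqueue_fill"]


stats = parse_cbstats(SAMPLE)
excess = stats["ep_diskqueue_drain"] - stats["ep_diskqueue_fill"]
print(drain_exceeds_fill(stats), excess)  # True 54412677
```

With the values reported here, the drain counter is roughly twice the fill counter, even though items are still sitting in the queue.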

      This happens because every time a persist request for a dirty item fails and is re-queued, ep_diskqueue_drain is still incremented.
      In other words, ep-engine counts each write commit failure as an increment to the ep_diskqueue_drain stat:

      root@cola-s10309:~# tail /opt/couchbase/var/lib/couchbase/logs/memcached.log.184.txt
      Thu May 30 18:35:46.295403 PDT 3: (default) Fatal error in persisting SET ``pymc679617'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295421 PDT 3: (default) Fatal error in persisting SET ``pymc679787'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295440 PDT 3: (default) Fatal error in persisting SET ``pymc68236'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295456 PDT 3: (default) Fatal error in persisting SET ``pymc68544'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295468 PDT 3: (default) Fatal error in persisting SET ``pymc689728'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295480 PDT 3: (default) Fatal error in persisting SET ``pymc690236'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295505 PDT 3: (default) Fatal error in persisting SET ``pymc690544'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295516 PDT 3: (default) Fatal error in persisting SET ``pymc692883'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295526 PDT 3: (default) Fatal error in persisting SET ``pymc692913'' on vb 628!!! Requeue it...
      Thu May 30 18:35:46.295536 PDT 3: (default) Fatal error in persisting SET ``pymc693017'' on vb 628!!! Requeue it...
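
The effect described above can be illustrated with a toy model. The sketch below is not ep-engine code: `flush_once`, `simulate`, and the `count_requeues_as_drained` flag are hypothetical names, and the model assumes every persist attempt fails, as on a full disk.

```python
from collections import deque


def flush_once(queue, disk_ok, count_requeues_as_drained):
    """Run one flusher pass; return how much the drain counter gains."""
    drained = 0
    for _ in range(len(queue)):
        item = queue.popleft()
        if disk_ok:
            drained += 1        # item persisted and really left the queue
        else:
            queue.append(item)  # persist failed: requeue the item
            if count_requeues_as_drained:
                drained += 1    # behavior described in this report
    return drained


def simulate(passes, items, count_requeues_as_drained):
    """Simulate repeated flusher passes against a full disk."""
    queue = deque(range(items))
    drain_counter = 0
    for _ in range(passes):
        drain_counter += flush_once(
            queue,
            disk_ok=False,
            count_requeues_as_drained=count_requeues_as_drained,
        )
    return drain_counter, len(queue)


print(simulate(5, 1000, True))   # (5000, 1000)
print(simulate(5, 1000, False))  # (0, 1000)
```

In the first run the drain counter grows with every pass while the queue never shrinks, which matches the high drain rate shown in the UI. In the second run, where a failed persist is not counted, the drain counter stays at zero.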

          People

            tommie Tommie McAfee (Inactive)
            Chisheng Chisheng Hong (Inactive)