  1. Couchbase Server
  2. MB-7319

"metadata overhead warning" alert doesn't seem to be calculated correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: 2.0.1
    • Component/s: ns_server
    • Security Level: Public
    • Labels:
      None
    • Environment:

      Description

      "metadata overhead warning" alert doesn't seem to be calculated correctly

      The calculation for metadata overhead warning is done with (ep_overhead/ep_max_data_size) * 100
      (Ref - https://github.com/couchbase/ns_server/blob/master/src/menelaus_web_alerts_srv.erl#L274)

      As per the documentation (stats.org):
      ep_overhead - Extra memory used by transient data like persistence queues, replication queues, checkpoints, etc
      ep_max_data_size - Max amount of data allowed in memory

      So ep_overhead does not seem to be the right stat to use here, since the alert is meant to measure metadata overhead.

      For example, on a node where most of the data in RAM is metadata:

      bash> ./cbstats 10.3.3.95:11210 all | grep mem
      ep_diskqueue_memory: 0
      ep_mem_high_wat: 805306368
      ep_mem_low_wat: 644245094
      ep_mem_tracker_enabled: true
      ep_mutation_mem_threshold: 0
      ep_warmup_min_memory_threshold: 100
      mem_used: 949808672
      vb_active_ht_memory: 25407488
      vb_active_itm_memory: 346355534
      vb_active_meta_data_memory: 334585930
      vb_active_perc_mem_resident: 0
      vb_active_queue_memory: 0
      vb_pending_ht_memory: 0
      vb_pending_itm_memory: 0
      vb_pending_meta_data_memory: 0
      vb_pending_perc_mem_resident: 0
      vb_pending_queue_memory: 0
      vb_replica_ht_memory: 25407488
      vb_replica_itm_memory: 348684429
      vb_replica_meta_data_memory: 336858898
      vb_replica_perc_mem_resident: 0
      vb_replica_queue_memory: 0

      The metadata overhead warning is not generated. The stats used in the calculation are:

      bash> ./cbstats 10.3.3.95:11210 all | grep ep_overhead
      ep_overhead: 52325504
      bash> ./cbstats 10.3.3.95:11210 all | grep ep_max_data_size
      ep_max_data_size: 1073741824
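
To make the discrepancy concrete, the following is a minimal sketch that applies the formula quoted above to the reported stats and compares it with the same ratio computed from the per-vbucket-state metadata counters. The helper name `percent_of_quota` and the idea of summing the `vb_*_meta_data_memory` stats are illustrative assumptions, not the ns_server implementation. The 50% threshold is taken from the alert text.

```python
# Stat values copied from the cbstats output in this report.
EP_OVERHEAD = 52325504
EP_MAX_DATA_SIZE = 1073741824
VB_ACTIVE_META_DATA_MEMORY = 334585930
VB_PENDING_META_DATA_MEMORY = 0
VB_REPLICA_META_DATA_MEMORY = 336858898

# Threshold taken from the alert text ("Over 50% of RAM allocated to bucket").
THRESHOLD_PCT = 50.0


def percent_of_quota(stat_bytes: int, max_data_size: int) -> float:
    """Hypothetical helper: a stat expressed as a percentage of the bucket quota."""
    return stat_bytes / max_data_size * 100


# Formula described in this report: (ep_overhead / ep_max_data_size) * 100
current_pct = percent_of_quota(EP_OVERHEAD, EP_MAX_DATA_SIZE)

# Assumption for illustration: metadata memory is the sum of the
# active, pending and replica meta_data_memory counters.
metadata_bytes = (
    VB_ACTIVE_META_DATA_MEMORY
    + VB_PENDING_META_DATA_MEMORY
    + VB_REPLICA_META_DATA_MEMORY
)
metadata_pct = percent_of_quota(metadata_bytes, EP_MAX_DATA_SIZE)

print(f"ep_overhead based:  {current_pct:.2f}% (alert: {current_pct > THRESHOLD_PCT})")
print(f"metadata based:     {metadata_pct:.2f}% (alert: {metadata_pct > THRESHOLD_PCT})")
```

With these numbers the ep_overhead ratio is about 4.87%, well under the threshold, while the metadata counters add up to about 62.53% of ep_max_data_size. This matches the observation that no alert is raised even though metadata dominates RAM.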


        Activity

        alkondratenko Aleksey Kondratenko (Inactive) added a comment -

        Chiyoung, please review and let me know if we should use some other formula.

        That is, things could have changed since the initial implementation of alerts, and we may indeed want to revise the overhead formula.

        chiyoung Chiyoung Seo added a comment -

        We already opened the bug for this issue:

        http://www.couchbase.com/issues/browse/MB-7218

        farshid Farshid Ghods (Inactive) added a comment -

        If I understand correctly, MB-7218 is about a stat not being updated on the ep-engine side, while this bug is about the formula ns_server uses for raising the alert.

        I would like to keep these two bugs separate. Once MB-7218 is fixed, we can revisit this bug.

        Reopening and assigning this to Deep.

        dipti Dipti Borkar added a comment -

        Is this a regression?

        farshid Farshid Ghods (Inactive) added a comment -

        It is mentioned in MB-7218 that it is not a regression.

        farshid Farshid Ghods (Inactive) added a comment -

        Deep,

        The latest build has this fix. Can you please verify?

        deepkaran.salooja Deepkaran Salooja added a comment -

        Verified with build 2.0.1-124-rel. The alert "Metadata overhead warning. Over 50% of RAM allocated to bucket "default" on node "10.3.3.95" is taken up by keys and metadata."
        is now generated correctly when ep_meta_data_memory is more than 50% of ep_max_data_size.

        The alert was generated in both of the cases below:

        Without replicas (bucket size = 1024MB, Nodes = 1)

        root@ubuntu1104-64:/opt/couchbase/bin# ./cbstats 10.3.3.95:11210 all | grep max_data
        ep_max_data_size: 1073741824
        root@ubuntu1104-64:/opt/couchbase/bin# ./cbstats 10.3.3.95:11210 all | grep mem
        ep_diskqueue_memory: 0
        ep_mem_high_wat: 805306368
        ep_mem_low_wat: 644245094
        ep_mem_tracker_enabled: true
        ep_meta_data_memory: 555237786
        ep_mutation_mem_threshold: 90
        ep_warmup_min_memory_threshold: 100
        mem_used: 949942880
        vb_active_ht_memory: 50398208
        vb_active_itm_memory: 782414731
        vb_active_meta_data_memory: 555237786
        vb_active_perc_mem_resident: 4
        vb_active_queue_memory: 0
        vb_pending_ht_memory: 0
        vb_pending_itm_memory: 0
        vb_pending_meta_data_memory: 0
        vb_pending_perc_mem_resident: 0
        vb_pending_queue_memory: 0
        vb_replica_ht_memory: 0
        vb_replica_itm_memory: 0
        vb_replica_meta_data_memory: 0
        vb_replica_perc_mem_resident: 0
        vb_replica_queue_memory: 0

        With 1 replica (bucket size = 1024MB, Nodes = 2)

        bash> ./cbstats 10.3.3.95:11210 all | grep mem
        ep_diskqueue_memory: 960512
        ep_mem_high_wat: 805306368
        ep_mem_low_wat: 644245094
        ep_mem_tracker_enabled: true
        ep_meta_data_memory: 540306900
        ep_mutation_mem_threshold: 90
        ep_warmup_min_memory_threshold: 100
        mem_used: 920098192
        vb_active_ht_memory: 25199104
        vb_active_itm_memory: 395596710
        vb_active_meta_data_memory: 270034632
        vb_active_perc_mem_resident: 4
        vb_active_queue_memory: 619680
        vb_pending_ht_memory: 0
        vb_pending_itm_memory: 0
        vb_pending_meta_data_memory: 0
        vb_pending_perc_mem_resident: 0
        vb_pending_queue_memory: 0
        vb_replica_ht_memory: 25199104
        vb_replica_itm_memory: 351621545
        vb_replica_meta_data_memory: 270272268
        vb_replica_perc_mem_resident: 2
        vb_replica_queue_memory: 340832
        bash> ./cbstats 10.3.3.95:11210 all | grep max_data
        ep_max_data_size: 1073741824
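
As a cross-check of the two cases above, here is a small sketch that parses `key: value` lines in the format printed by cbstats and evaluates the condition described in this comment (ep_meta_data_memory greater than 50% of ep_max_data_size). The function names `parse_cbstats` and `metadata_alert` are hypothetical, and the sample text is trimmed to the relevant stats. This is not the ns_server code.

```python
def parse_cbstats(text: str) -> dict[str, int]:
    """Parse 'name: value' lines, keeping only stats with integer values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():  # skips e.g. 'ep_mem_tracker_enabled: true'
            stats[key.strip()] = int(value)
    return stats


def metadata_alert(stats: dict[str, int], threshold_pct: float = 50.0):
    """Return (alert_fires, percentage) for the condition verified above."""
    pct = stats["ep_meta_data_memory"] / stats["ep_max_data_size"] * 100
    return pct > threshold_pct, pct


# Values copied from the two verification runs in this comment.
NO_REPLICA = """\
ep_max_data_size: 1073741824
ep_mem_tracker_enabled: true
ep_meta_data_memory: 555237786
vb_active_meta_data_memory: 555237786
vb_pending_meta_data_memory: 0
vb_replica_meta_data_memory: 0
"""

ONE_REPLICA = """\
ep_max_data_size: 1073741824
ep_mem_tracker_enabled: true
ep_meta_data_memory: 540306900
vb_active_meta_data_memory: 270034632
vb_pending_meta_data_memory: 0
vb_replica_meta_data_memory: 270272268
"""

for label, sample in (("no replica", NO_REPLICA), ("1 replica", ONE_REPLICA)):
    stats = parse_cbstats(sample)
    fires, pct = metadata_alert(stats)
    # In both samples the aggregate equals the sum of the per-state counters.
    per_state = sum(
        stats[f"vb_{state}_meta_data_memory"]
        for state in ("active", "pending", "replica")
    )
    print(f"{label}: {pct:.2f}% alert={fires} sum_matches={per_state == stats['ep_meta_data_memory']}")
```

Both runs land just above the threshold (about 51.71% and 50.32%), which is consistent with the alert firing in each case.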


          People

          • Assignee:
            deepkaran.salooja Deepkaran Salooja
            Reporter:
            deepkaran.salooja Deepkaran Salooja
