Couchbase Server / MB-33687

base_stats_collector crashing repeatedly

Details

    • Untriaged
    • Centos 64-bit
    • Unknown

    Description

      On node 172.23.107.160, ns_server.debug.log shows a crash report for base_stats_collector:init/1 logged every second from 2019-04-05T04:17:38.998-07:00 to 2019-04-05T04:58:52.991-07:00 (about 41 minutes).
      This causes a test that verifies bidirectional XDCR to time out while waiting for stats.
      The failure is intermittent in build sanity (it passed on build 2844). I want to rule out any abnormal behavior.

      [error_logger:error,2019-04-05T04:58:52.991-07:00,ns_1@172.23.107.160:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
      =========================CRASH REPORT=========================
        crasher:
          initial call: base_stats_collector:init/1
          pid: <0.7797.5>
          registered_name: []
          exception error: no match of right hand side value <<"mean">>
            in function  stats_collector:parse_timing_range/1 (src/stats_collector.erl, line 317)
            in call from stats_collector:aggregate_timings/4 (src/stats_collector.erl, line 333)
            in call from stats_collector:parse_timings/4 (src/stats_collector.erl, line 344)
            in call from stats_collector:process_stats/5 (src/stats_collector.erl, line 144)
            in call from base_stats_collector:handle_info/2 (src/base_stats_collector.erl, line 95)
            in call from gen_server:try_dispatch/4 (gen_server.erl, line 616)
            in call from gen_server:handle_msg/6 (gen_server.erl, line 686)
          ancestors: ['single_bucket_kv_sup-default',ns_bucket_sup,
                        ns_bucket_worker_sup,ns_server_sup,ns_server_nodes_sup,
                        <0.194.0>,ns_server_cluster_sup,<0.118.0>]
          message_queue_len: 0
          messages: []
          links: [<0.7798.5>,<0.7799.5>,<0.7659.1>]
          dictionary: []
          trap_exit: false
          status: running
          heap_size: 75113
          stack_size: 27
          reductions: 18089
        neighbours:
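
The "no match of right hand side value" error above is an Erlang badmatch: a pattern in stats_collector:parse_timing_range/1 was matched against a value that turned out to be <<"mean">>. The source of stats_collector.erl is not shown in this report, so the following is only a hypothetical Python analogue of that failure mode. It assumes the parser expects a "low,high" bucket range and receives a name such as "mean" that has no comma. The function bodies and the tolerant variant are illustrative assumptions, not the actual ns_server code or its fix.

```python
from typing import Optional, Tuple


def parse_timing_range(suffix: str) -> Tuple[int, int]:
    """Strict parser (hypothetical): accepts only "low,high" bucket suffixes."""
    parts = suffix.split(",")
    if len(parts) != 2:
        # Analogue of the Erlang badmatch: the value does not fit the pattern.
        raise ValueError(f"no match of right hand side value {suffix!r}")
    low, high = parts
    return int(low), int(high)


def parse_timing_range_tolerant(suffix: str) -> Optional[Tuple[int, int]]:
    """Tolerant variant (hypothetical): returns None for non-range suffixes."""
    try:
        return parse_timing_range(suffix)
    except ValueError:
        return None
```

With the strict version, a single unexpected stat name aborts the whole stats-processing pass, which would be consistent with a collector that crashes on every collection tick. The tolerant version skips such entries instead.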
      

      I also see this warning in babysitter.log at 04:17:26, roughly 13 seconds before the first crash report:

      [ns_server:info,2019-04-05T04:17:26.258-07:00,babysitter_of_ns_1@127.0.0.1:<0.105.0>:ns_port_server:log:224]memcached<0.105.0>: 2019-04-05T04:17:26.057073-07:00 WARNING (default) Slow runtime for 'Updating stat snapshot on disk' on thread writer_worker_0: 305 ms
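
To check the crash cadence described in this report against a log file, a small scanner can collect the timestamps of crash reports for a given initial call. This is a minimal sketch that assumes every crash report is preceded by an `[error_logger:error,<timestamp>,...]` header line in the format shown in the excerpt above. The function name `crash_times` and the regular expressions are illustrative and are not part of any Couchbase tool.

```python
import re
from datetime import datetime

# Header line such as:
# [error_logger:error,2019-04-05T04:58:52.991-07:00,ns_1@...:do_log:203]
HEADER_RE = re.compile(r"^\s*\[error_logger:error,(?P<ts>[^,]+),")
# Line such as: "initial call: base_stats_collector:init/1"
INITIAL_CALL_RE = re.compile(r"initial call:\s*(?P<mfa>\S+)")


def crash_times(lines, initial_call="base_stats_collector:init/1"):
    """Return the header timestamps of crash reports for `initial_call`."""
    times = []
    last_ts = None  # timestamp of the most recent error_logger header
    for line in lines:
        header = HEADER_RE.match(line)
        if header:
            last_ts = datetime.fromisoformat(header.group("ts"))
            continue
        call = INITIAL_CALL_RE.search(line)
        if call and call.group("mfa") == initial_call and last_ts is not None:
            times.append(last_ts)
            last_ts = None  # count each report once
    return times
```

Passing the open file object for ns_server.debug.log as `lines` yields a list whose length, first entry, and last entry can be compared with the interval quoted in the description.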
      

            People

              anitha.kuberan Anitha Kuberan
              pavithra.mahamani Pavithra Mahamani (Inactive)