Couchbase Server / MB-43402

Prometheus shutdown logs ERROR REPORT and CRASH REPORT

Details

    • Untriaged
    • 1
    • Unknown

    Description

      Every Prometheus shutdown also logs an ERROR REPORT and a CRASH REPORT message. Examples can be found in logs-functional-ns_server/logs/n_1/debug.log in the tarballs attached to https://issues.couchbase.com/browse/MB-43391. (The tarball had to be split arbitrarily into "a" and "b" pieces to fit under Jira's 100 MB file size limit; together they hold the complete set of logs from the test run.)

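      Example: Prometheus says it is terminating gracefully and the port exits with status 0, but the ERROR REPORT and CRASH REPORT that follow give the termination reason {abnormal,0}, indicating the shutdown was treated as abnormal: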

      [ns_server:error,2020-12-19T23:14:01.426+05:30,n_1@127.0.0.1:prometheus_cfg<0.2559.0>:prometheus_cfg:terminate:382]Terminate: shutdown
      [ns_server:debug,2020-12-19T23:14:01.427+05:30,n_1@127.0.0.1:prometheus_cfg<0.2559.0>:prometheus_cfg:terminate_prometheus:511]Terminating Prometheus gracefully
      [ns_server:debug,2020-12-19T23:14:01.427+05:30,n_1@127.0.0.1:<0.2571.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {user_storage_events,<0.2569.0>} exited with reason shutdown
      [ns_server:debug,2020-12-19T23:14:01.427+05:30,n_1@127.0.0.1:<0.2570.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {ns_config_events,<0.2569.0>} exited with reason shutdown
      [ns_server:debug,2020-12-19T23:14:01.420+05:30,n_1@127.0.0.1:<0.2617.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {buckets_events,<0.2614.0>} exited with reason shutdown
      [ns_server:debug,2020-12-19T23:14:01.520+05:30,n_1@127.0.0.1:prometheus-goport<0.2568.0>:goport:handle_eof:582]Stream 'stderr' closed
      [ns_server:debug,2020-12-19T23:14:01.520+05:30,n_1@127.0.0.1:prometheus-goport<0.2568.0>:goport:handle_eof:582]Stream 'stdout' closed
      [ns_server:info,2020-12-19T23:14:01.520+05:30,n_1@127.0.0.1:prometheus-goport<0.2568.0>:goport:handle_process_exit:563]Port exited with status 0.
      [error_logger:error,2020-12-19T23:14:01.521+05:30,n_1@127.0.0.1:<0.2562.0>:ale_error_logger_handler:do_log:107]
      =========================ERROR REPORT=========================
      ** Generic server <0.2562.0> terminating
      ** Last message in was {<0.2568.0>,{exit_status,0}}
      ** When Server state == {state,<0.2568.0>,
                                  {prometheus,"/opt/build/install/bin/prometheus",
                                      ["--config.file",
                                       "/opt/build/ns_server/data/n_1/config/prometheus.yml",
                                       "--web.enable-admin-api",
                                       "--web.enable-lifecycle",
                                       "--storage.tsdb.retention.size","1024MB",
                                       "--storage.tsdb.retention.time","365d",
                                       "--web.listen-address","127.0.0.1:9901",
                                       "--storage.tsdb.max-block-duration","25h",
                                       "--storage.tsdb.path",
                                       "/opt/build/ns_server/data/n_1/stats_data",
                                       "--log.level","debug","--query.max-samples",
                                       "200000","--storage.tsdb.no-lockfile",
                                       "--web.basicauth.config",
                                       "/opt/build/ns_server/data/n_1/config/prometheus_auth"],
                                      [via_goport,exit_status,stderr_to_stdout,
                                       {env,[]}]},
                                  {ringbuffer,1228,1024,
                                      {[{<<"level=info ts=2020-12-19T17:44:01.482Z caller=notifier.go:601 component=notifier msg=\"Stopping notification manager...\"\nlevel=info ts=2020-12-19T17:44:01.482Z caller=main.go:792 msg=\"Notifier manager stopped\"\nlevel=info ts=2020-12-19T17:44:01.482Z caller=main.go:804 msg=\"See you next time!\"\n">>,
                                         292}],
                                       [{<<"level=warn ts=2020-12-19T17:44:01.475Z caller=main.go:568 msg=\"Received termination request via web service, exiting gracefully...\"\nlevel=info ts=2020-12-19T17:44:01.475Z caller=main.go:588 msg=\"Stopping scrape discovery manager...\"\nlevel=info ts=2020-12-19T17:44:01.476Z caller=main.go:602 msg=\"Stopping notify discovery manager...\"\nlevel=info ts=2020-12-19T17:44:01.476Z caller=main.go:624 msg=\"Stopping scrape manager...\"\nlevel=info ts=2020-12-19T17:44:01.476Z caller=main.go:584 msg=\"Scrape discovery manager stopped\"\nlevel=info ts=2020-12-19T17:44:01.476Z caller=main.go:598 msg=\"Notify discovery manager stopped\"\nlevel=info ts=2020-12-19T17:44:01.476Z caller=manager.go:924 component=\"rule manager\" msg=\"Stopping rule manager...\"\nlevel=info ts=2020-12-19T17:44:01.476Z caller=manager.go:934 component=\"rule manager\" msg=\"Rule manager stopped\"\nlevel=info ts=2020-12-19T17:44:01.476Z caller=main.go:618 msg=\"Scrape manager stopped\"\n">>,
                                         936}]}},
                                  prometheus,undefined,[],0}
      ** Reason for termination ==
      ** {abnormal,0}
       
      [error_logger:error,2020-12-19T23:14:01.523+05:30,n_1@127.0.0.1:<0.2562.0>:ale_error_logger_handler:do_log:107]
      =========================CRASH REPORT=========================
        crasher:
          initial call: ns_port_server:init/1
          pid: <0.2562.0>
          registered_name: []
          exception exit: {abnormal,0}
            in function  gen_server:handle_common_reply/8 (gen_server.erl, line 751)
          ancestors: [prometheus_cfg,ns_server_sup,ns_server_nodes_sup,<0.2438.0>,
                        ns_server_cluster_sup,root_sup,<0.139.0>]
          message_queue_len: 1
          messages: [{'EXIT',<0.2568.0>,normal}]
          links: [<0.2559.0>]
          dictionary: []
          trap_exit: true
          status: running
          heap_size: 10958
          stack_size: 27
          reductions: 26389
        neighbours:
      [ns_server:debug,2020-12-19T23:14:01.525+05:30,n_1@127.0.0.1:prometheus_cfg<0.2559.0>:prometheus_cfg:terminate_prometheus:532]Prometheus port server stopped successfully
      [ns_server:debug,2020-12-19T23:14:01.525+05:30,n_1@127.0.0.1:<0.2560.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {ns_config_events,<0.2559.0>} exited with reason shutdown
      [ns_server:debug,2020-12-19T23:14:01.528+05:30,n_1@127.0.0.1:ns_couchdb_port<0.2505.0>:ns_port_server:terminate:196]Shutting down port ns_couchdb
      [ns_server:debug,2020-12-19T23:14:01.529+05:30,n_1@127.0.0.1:<0.2550.0>:remote_monitors:handle_down:158]Caller of remote monitor <0.2506.0> died with shutdown. Exiting
      [ns_server:debug,2020-12-19T23:14:01.529+05:30,n_1@127.0.0.1:ns_couchdb_port<0.2505.0>:ns_port_server:port_shutdown:297]Shutdown command: "shutdown"
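
To check how often this pattern occurs across the attached logs, something like the following Python sketch could be used. It is not part of ns_server; the function name and the regular expressions are assumptions based only on the excerpt above. It reports each ERROR REPORT or CRASH REPORT whose body cites a reason of the form {abnormal,N}, together with N.

```python
import re

# Report banners as they appear in the excerpt, e.g. "=====ERROR REPORT=====".
BANNER = re.compile(r"=+(ERROR|CRASH) REPORT=+")
# Start of a regular log entry, e.g. "[ns_server:debug,2020-12-19T...".
LOG_HEADER = re.compile(r"^\s*\[\w+:\w+,\d{4}-")
# Termination reason of the form {abnormal,N}; N is the port's exit status.
ABNORMAL = re.compile(r"\{abnormal,(\d+)\}")


def find_abnormal_reports(lines):
    """Return (report_kind, exit_status) pairs for reports citing {abnormal,N}."""
    results = []
    current = None  # kind of the report currently being read, if any
    for line in lines:
        banner = BANNER.search(line)
        if banner:
            current = banner.group(1)
            continue
        if LOG_HEADER.match(line):
            # A new log entry starts, so any report body has ended.
            current = None
            continue
        if current is None:
            continue
        reason = ABNORMAL.search(line)
        if reason:
            results.append((current, int(reason.group(1))))
            current = None  # one reason per report is enough
    return results
```

Passing the lines of debug.log (for example, an open file object) to find_abnormal_reports yields one pair per matching report; a status of 0 in a pair marks a report logged even though the port exited cleanly.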
      

          People

            timofey.barmin Timofey Barmin
            kevin.cherkauer Kevin Cherkauer (Inactive)