Couchbase Server / MB-16826

Couchbase Server intermittently fails to restart

Details

    • Bug
    • Resolution: Fixed
    • Test Blocker
    • 3.1.3, 4.1.1, 4.5.0
    • 4.5.0
    • ns_server
    • Security Level: Public
    • None
    • 4.5.0-611
    • Untriaged
    • Unknown

    Description

      Most likely this problem is a consequence of MB-16696.

      Sometimes when I try to restart Couchbase Server, I get the following:

      [root@hera02-s815 logs]# /etc/init.d/couchbase-server status
      couchbase-server is running
      [root@hera02-s815 logs]# ps -ef| grep couch
      root 68411 1 0 14:36 ? 00:00:00 /opt/couchbase/lib/erlang/erts-5.10.4.0.0.1/bin/epmd -daemon
      500 68702 1 0 14:36 ? 00:00:01 /opt/couchbase/lib/erlang/erts-5.10.4.0.0.1/bin/beam.smp -A 16 -- -root /opt/couchbase/lib/erlang -progname erl -- -home /home/couchbase -- -smp enable -kernel inet_dist_listen_min 21100 inet_dist_listen_max 21299 error_logger false -sasl sasl_error_logger false -hidden -name babysitter_of_ns_1@127.0.0.1 -setcookie nocookie -- -noshell -noinput -noshell -noinput -run ns_babysitter_bootstrap -- -couch_ini /opt/couchbase/etc/couchdb/default.ini /opt/couchbase/etc/couchdb/default.d/capi.ini /opt/couchbase/etc/couchdb/default.d/geocouch.ini /opt/couchbase/etc/couchdb/local.ini -ns_babysitter cookiefile "/opt/couchbase/var/lib/couchbase/couchbase-server.cookie" -ns_server config_path "/opt/couchbase/etc/couchbase/static_config" -ns_server pidfile "/opt/couchbase/var/lib/couchbase/couchbase-server.pid" -ns_server cookiefile "/opt/couchbase/var/lib/couchbase/couchbase-server.cookie-ns-server" -ns_server enable_mlockall false
      500 68761 68702 1 14:36 ? 00:00:03 /opt/couchbase/lib/erlang/erts-5.10.4.0.0.1/bin/beam.smp -A 16 -sbt u -P 327680 -K true -swt low -MMmcs 30 -e102400 -- -root /opt/couchbase/lib/erlang -progname erl -- -home /home/couchbase -- -smp enable -setcookie nocookie -kernel inet_dist_listen_min 21100 inet_dist_listen_max 21299 error_logger false -sasl sasl_error_logger false -nouser -run child_erlang child_start ns_bootstrap -- -smp enable -couch_ini /opt/couchbase/etc/couchdb/default.ini /opt/couchbase/etc/couchdb/default.d/capi.ini /opt/couchbase/etc/couchdb/default.d/geocouch.ini /opt/couchbase/etc/couchdb/local.ini
      500 68821 68761 0 14:36 ? 00:00:00 /opt/couchbase/lib/erlang/lib/os_mon-2.2.14/priv/bin/memsup
      500 68822 68761 0 14:36 ? 00:00:00 /opt/couchbase/lib/erlang/lib/os_mon-2.2.14/priv/bin/cpu_sup
      root 68844 68841 0 14:36 ? 00:00:00 /bin/sh /etc/init.d/couchbase-server restart
      root 68863 68844 0 14:36 ? 00:00:00 /bin/bash -c ulimit -S -c unlimited >/dev/null 2>&1 ; /opt/couchbase/bin/couchbase-server -k
      root 68864 68863 0 14:36 ? 00:00:00 /bin/bash /opt/couchbase/bin/couchbase-server -k
      root 68870 68864 0 14:36 ? 00:00:00 /opt/couchbase/lib/erlang/erts-5.10.4.0.0.1/bin/beam.smp -- -root /opt/couchbase/lib/erlang -progname erl -- -home /root -- -name executioner@executioner -noshell -hidden -setcookie rjzuykgwxweavuwf -eval ns_babysitter_bootstrap:remote_stop('babysitter_of_ns_1@127.0.0.1')
      root 69097 69025 0 14:40 pts/0 00:00:00 grep couch
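
In the listing above, the `restart` invoked at 14:36 and its `couchbase-server -k` stop helper (including the `executioner@executioner` Erlang node) are still present at 14:40. A minimal sketch for spotting such leftovers in `ps -ef` output is shown below. The function name and the two marker strings are assumptions made for illustration; they are not part of any Couchbase tooling.

```python
from typing import List

# Substrings that, in this report, identify the stop helper processes.
# These markers are an assumption taken from the ps listing above.
STOP_HELPER_MARKERS = ("couchbase-server -k", "-name executioner@executioner")


def find_stuck_stop_helpers(ps_output: str) -> List[int]:
    """Return the PIDs of `ps -ef` lines whose command matches a marker."""
    pids = []
    for line in ps_output.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        if any(marker in fields[7] for marker in STOP_HELPER_MARKERS):
            pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    import subprocess

    listing = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout
    print(find_stuck_stop_helpers(listing))
```

A non-empty result minutes after a restart was issued would suggest that the stop request never completed.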

      [root@hera02-s815 logs]# tail -n 100 debug.log
      [ns_server:info,2015-11-14T14:36:56.881-08:00,ns_1@127.0.0.1:ns_ssl_services_setup<0.159.0>:ns_ssl_services_setup:do_generate_local_cert:432]Saved local cert for node 'ns_1@127.0.0.1'
      [ns_server:debug,2015-11-14T14:36:56.883-08:00,ns_1@127.0.0.1:<0.353.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_stats_event,<0.352.0>} exited with reason shutdown
      [ns_server:info,2015-11-14T14:36:56.885-08:00,ns_1@127.0.0.1:ns_ssl_services_setup<0.159.0>:ns_ssl_services_setup:handle_info:379]Wrote new pem file
      [ns_server:debug,2015-11-14T14:36:56.885-08:00,ns_1@127.0.0.1:<0.164.0>:restartable:loop:71]Restarting child <0.165.0>
        MFA: {ns_ssl_services_setup,start_link_rest_service,[]}
        Shutdown policy: 1000
        Caller: {<0.390.0>,#Ref<0.0.0.1458>}

      [ns_server:debug,2015-11-14T14:36:56.885-08:00,ns_1@127.0.0.1:<0.392.0>:ns_ports_manager:restart_port_by_name:43]Requesting restart of port xdcr_proxy
      [ns_server:warn,2015-11-14T14:36:56.886-08:00,ns_1@127.0.0.1:<0.394.0>:ns_memcached:connect:1290]Unable to connect: {error,{badmatch,{error,econnrefused}}}, retrying.
      [ns_server:debug,2015-11-14T14:36:56.888-08:00,ns_1@127.0.0.1:<0.350.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_stats_event,<0.349.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.888-08:00,ns_1@127.0.0.1:<0.348.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_tick_event,<0.345.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.888-08:00,ns_1@127.0.0.1:<0.347.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ale_stats_events,<0.345.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.888-08:00,ns_1@127.0.0.1:<0.344.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.343.0>} exited with reason shutdown
      [error_logger:error,2015-11-14T14:36:56.889-08:00,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
      =========================SUPERVISOR REPORT=========================
      Supervisor: {local,ns_bucket_sup}
      Context: shutdown_error
      Reason: normal
      Offender: [{pid,<0.344.0>},
                 {name,buckets_observing_subscription},
                 {mfargs,{ns_bucket_sup,subscribe_on_config_events,[]}},
                 {restart_type,permanent},
                 {shutdown,1000},
                 {child_type,worker}]


      [ns_server:debug,2015-11-14T14:36:56.889-08:00,ns_1@127.0.0.1:<0.333.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.332.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.889-08:00,ns_1@127.0.0.1:<0.317.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.316.0>} exited with reason killed
      [ns_server:debug,2015-11-14T14:36:56.889-08:00,ns_1@127.0.0.1:<0.320.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.319.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.889-08:00,ns_1@127.0.0.1:<0.314.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_node_disco_events,<0.312.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.889-08:00,ns_1@127.0.0.1:<0.313.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {json_rpc_events,<0.312.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.889-08:00,ns_1@127.0.0.1:<0.315.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.312.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.890-08:00,ns_1@127.0.0.1:<0.286.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.285.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.890-08:00,ns_1@127.0.0.1:<0.281.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {master_activity_events,<0.280.0>} exited with reason killed
      [ns_server:info,2015-11-14T14:36:56.890-08:00,ns_1@127.0.0.1:mb_master<0.256.0>:mb_master:terminate:299]Synchronously shutting down child mb_master_sup
      [ns_server:debug,2015-11-14T14:36:56.890-08:00,ns_1@127.0.0.1:<0.257.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.256.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.890-08:00,ns_1@127.0.0.1:<0.249.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.248.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.890-08:00,ns_1@127.0.0.1:<0.242.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {buckets_events,<0.241.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.891-08:00,ns_1@127.0.0.1:<0.233.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.231.0>} exited with reason killed
      [ns_server:debug,2015-11-14T14:36:56.891-08:00,ns_1@127.0.0.1:<0.230.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.229.0>} exited with reason killed
      [ns_server:debug,2015-11-14T14:36:56.891-08:00,ns_1@127.0.0.1:<0.222.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events_local,<0.221.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.891-08:00,ns_1@127.0.0.1:<0.209.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.208.0>} exited with reason shutdown
      [ns_server:debug,2015-11-14T14:36:56.892-08:00,ns_1@127.0.0.1:<0.200.0>:remote_monitors:handle_down:158]Caller of remote monitor <0.183.0> died with shutdown. Exiting
      [ns_server:debug,2015-11-14T14:36:56.892-08:00,ns_1@127.0.0.1:ns_couchdb_port<0.182.0>:ns_port_server:terminate:182]Sending shutdown to port ns_couchdb
      [error_logger:error,2015-11-14T14:36:56.895-08:00,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
      =========================CRASH REPORT=========================
      crasher:
      initial call: gen_event:init_it/6
      pid: <0.232.0>
      registered_name: bucket_info_cache_invalidations
      exception exit: killed
      in function gen_event:terminate_server/4 (gen_event.erl, line 320)
      ancestors: [bucket_info_cache,ns_server_sup,ns_server_nodes_sup,
      <0.153.0>,ns_server_cluster_sup,<0.89.0>]
      messages: []
      links: []
      dictionary: []
      trap_exit: true
      status: running
      heap_size: 376
      stack_size: 27
      reductions: 147
      neighbours:

      [error_logger:info,2015-11-14T14:36:56.901-08:00,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
      =========================INFO REPORT=========================
      {net_kernel,{'EXIT',<0.192.0>,connection_closed}}
      [ns_server:debug,2015-11-14T14:36:56.902-08:00,ns_1@127.0.0.1:ns_couchdb_port<0.182.0>:ns_port_server:terminate:185]ns_couchdb has exited
      [ns_server:info,2015-11-14T14:36:56.902-08:00,ns_1@127.0.0.1:ns_couchdb_port<0.182.0>:ns_port_server:log:210]ns_couchdb<0.182.0>: 68934: got shutdown request. Exiting
      ns_couchdb<0.182.0>: [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
      ns_couchdb<0.182.0>: [os_mon] memory supervisor port (memsup): Erlang has closed

      [error_logger:error,2015-11-14T14:37:56.886-08:00,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]** Generic server ns_ssl_services_setup terminating
      ** Last message in was notify_services
      ** When Server state == {}
      ** Reason for termination ==
      ** timeout

      [ns_server:debug,2015-11-14T14:37:56.886-08:00,ns_1@127.0.0.1:<0.160.0>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.159.0>} exited with reason timeout
      [error_logger:error,2015-11-14T14:37:56.887-08:00,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
      =========================CRASH REPORT=========================
      crasher:
      initial call: ns_ssl_services_setup:init/1
      pid: <0.159.0>
      registered_name: ns_ssl_services_setup
      exception exit: timeout
      in function gen_server:terminate/6 (gen_server.erl, line 744)
      ancestors: [ns_ssl_services_sup,ns_server_nodes_sup,<0.153.0>,
      ns_server_cluster_sup,<0.89.0>]
      messages: []
      links: [<0.158.0>,<0.160.0>]
      dictionary: [{ssl_manager,ssl_manager}]
      trap_exit: false
      status: running
      heap_size: 6772
      stack_size: 27
      reductions: 194708
      neighbours:
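
The `ns_ssl_services_setup` termination with reason `timeout` is logged at 14:37:56.886, while the restart of its REST service child was requested at 14:36:56.885. The sketch below measures that gap from the log headers. It assumes the `[logger:level,timestamp,...]` header layout visible in this excerpt; the function names are hypothetical. The roughly 60-second gap is an observation from the timestamps only; the log itself does not state which call timed out.

```python
import re
from datetime import datetime
from typing import Optional

# Matches the leading "[logger:level,2015-11-14T14:36:56.885-08:00," part
# of a debug.log entry, as seen in the excerpt above.
HEADER = re.compile(
    r"^\[[^,\]]+,(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})([+-]\d{2}):(\d{2}),"
)


def entry_time(line: str) -> Optional[datetime]:
    """Parse the timestamp of one log entry header, or return None."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    stamp, off_hours, off_minutes = match.groups()
    # Drop the colon in the UTC offset so that %z accepts it everywhere.
    return datetime.strptime(
        stamp + off_hours + off_minutes, "%Y-%m-%dT%H:%M:%S.%f%z"
    )


def seconds_between(first_line: str, second_line: str) -> float:
    """Return the elapsed seconds between two log entry headers."""
    first, second = entry_time(first_line), entry_time(second_line)
    if first is None or second is None:
        raise ValueError("both lines must start with a log entry header")
    return (second - first).total_seconds()


if __name__ == "__main__":
    requested = ("[ns_server:debug,2015-11-14T14:36:56.885-08:00,"
                 "ns_1@127.0.0.1:<0.164.0>:restartable:loop:71]Restarting child <0.165.0>")
    terminated = ("[error_logger:error,2015-11-14T14:37:56.886-08:00,"
                  "ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]"
                  "** Generic server ns_ssl_services_setup terminating")
    print(seconds_between(requested, terminated))
```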

      [root@hera02-s815 couchbase]# /etc/init.d/couchbase-server restart
      Stopping couchbase-server
      {error_logger,{{2015,11,14},{14,46,39}},"Protocol: ~tp: the name executioner@executioner seems to be in use by another Erlang node",["inet_tcp"]}
      {error_logger,{{2015,11,14},{14,46,39}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.21.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,320}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}},{ancestors,[net_sup,kernel_sup,<0.10.0>]},{messages,[]},{links,[#Port<0.53>,<0.18.0>]},{dictionary,[{longnames,true}]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,27},{reductions,768}],[]]}
      {error_logger,{{2015,11,14},{14,46,39}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfargs,{net_kernel,start_link,[['executioner@executioner',longnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
      {error_logger,{{2015,11,14},{14,46,39}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}},{offender,[{pid,undefined},{name,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
      {error_logger,{{2015,11,14},{14,46,39}},crash_report,[[{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.9.0>},{registered_name,[]},{error_info,{exit,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,133}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}},{ancestors,[<0.8.0>]},{messages,[{'EXIT',<0.10.0>,normal}]},{links,[<0.8.0>,<0.7.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,376},{stack_size,27},{reductions,117}],[]]}
      {error_logger,{{2015,11,14},{14,46,39}},std_info,[{application,kernel},{exited,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}}},{type,permanent}]}
      {"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}}}"}

      Crash dump was written to: erl_crash.dump.1447541199.69177.babysitter
      Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{k
      [FAILED]
      Starting couchbase-server
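
The "seems to be in use by another Erlang node" message is what an Erlang node reports when epmd already holds a registration for the same node name. Here the second restart tries to start another `executioner@executioner` node while the one from the earlier, still-running stop helper (PID 68870 in the `ps` listing) appears to be alive. A minimal sketch for checking this from `epmd -names` output follows. The function names are hypothetical and the port numbers in the usage sample are made up for illustration.

```python
import re
from typing import Dict

# `epmd -names` prints one "name <node> at port <port>" line per registration.
NAME_LINE = re.compile(r"^name (\S+) at port (\d+)$")


def registered_names(epmd_names_output: str) -> Dict[str, int]:
    """Map each node name registered with epmd to its distribution port."""
    names = {}
    for line in epmd_names_output.splitlines():
        match = NAME_LINE.match(line.strip())
        if match:
            names[match.group(1)] = int(match.group(2))
    return names


def name_in_use(node_name: str, epmd_names_output: str) -> bool:
    """Check whether the part of node_name before '@' is already registered."""
    short_name = node_name.split("@", 1)[0]
    return short_name in registered_names(epmd_names_output)


if __name__ == "__main__":
    # Illustrative output only; the port numbers are invented.
    sample = (
        "epmd: up and running on port 4369 with data:\n"
        "name executioner at port 40123\n"
        "name babysitter_of_ns_1 at port 21100\n"
    )
    print(name_in_use("executioner@executioner", sample))
```

If the name is still registered, a new stop attempt can be expected to fail with `nodistribution` as in the output above until the earlier helper exits.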

            People

              Aliaksey Artamonau (Inactive)
              Andrei Baranouski