Couchbase Server / MB-49876

addNode fails because of the unknown exception in error_logger on ns_couchdb node

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • 7.1.0
    • 7.1.0
    • Component: ns_server
    • None
    • Untriaged
    • 1
    • Unknown

    Description

      Spotted on Jenkins while running simple-test:

      11:20:04 b'2021-12-02 19:20:04 | ERROR | MainProcess | Cluster_Thread | [rest_client._http_request] POST http://127.0.0.1:9000/controller/addNode body: hostname=https%3A%2F%2F127.0.0.1%3A19001&user=Administrator&password=asdasd headers: {\'Content-Type\': \'application/x-www-form-urlencoded\', \'Authorization\': \'Basic QWRtaW5pc3RyYXRvcjphc2Rhc2Q=\', \'Accept\': \'*/*\'} error: 500 reason: unknown b\'["Unexpected server error, request logged."]\' auth: Administrator:asdasd'
      

      =========================ERROR REPORT=========================
      ** Generic server ns_cluster terminating 
      ** Last message in was {add_node_to_group,https,"127.0.0.1",19001,
                                 {"Administrator","asdasd"},
                                 undefined,
                                 [kv]}
      ** When Server state == {state}
      ** Reason for termination ==
      ** {{{error,wait_for_node_failed},
           {gen_server,call,
                       [dist_manager,
                        {adjust_my_address,"127.0.0.1",false,
                                           #Fun<ns_cluster.9.33945072>},
                        infinity]}},
          [{gen_server,call,3,[{file,"gen_server.erl"},{line,247}]},
           {ns_cluster,maybe_rename,2,[{file,"src/ns_cluster.erl"},{line,774}]},
           {ns_cluster,do_change_address,2,[{file,"src/ns_cluster.erl"},{line,750}]},
           {ns_cluster,do_add_node_allowed,6,
                       [{file,"src/ns_cluster.erl"},{line,822}]},
           {ns_cluster,handle_call,3,[{file,"src/ns_cluster.erl"},{line,444}]},
           {gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,721}]},
           {gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,750}]},
           {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}
      ** Client <0.4593.0> stacktrace
      ** [{gen,do_call,4,[{file,"gen.erl"},{line,233}]},
          {gen_server,call,3,[{file,"gen_server.erl"},{line,243}]},
          {ns_cluster,add_node_to_group,6,[{file,"src/ns_cluster.erl"},{line,80}]},
          {menelaus_web_cluster,'-do_handle_add_node/2-fun-1-',8,
                                [{file,"src/menelaus_web_cluster.erl"},{line,736}]},
          {menelaus_util,survive_web_server_restart,1,
                         [{file,"src/menelaus_util.erl"},{line,742}]},
          {request_tracker,request,2,[{file,"src/request_tracker.erl"},{line,40}]},
          {menelaus_util,handle_request,2,
                         [{file,"src/menelaus_util.erl"},{line,221}]},
          {mochiweb_http,headers,6,
                         [{file,"/home/couchbase/jenkins/workspace/ns-server-simple-test/couchdb/src/mochiweb/mochiweb_http.erl"},
                          {line,150}]}]
      

      The ns_couchdb node was shut down after the rename, leaving the following in stderr:

      [ns_server:info,2021-12-02T19:20:03.309Z,n_0@127.0.0.1:ns_couchdb_port<0.376.0>:ns_port_server:log:226]ns_couchdb<0.376.0>: {removed_failing_handler,error_logger}
       
      [error_logger:error,2021-12-02T19:20:03.838Z,n_0@127.0.0.1:ns_couchdb_port<0.376.0>:ale_error_logger_handler:do_log:101]
      =========================ERROR REPORT=========================
      ** Generic server ns_couchdb_port terminating 
      ** Last message in was {#Port<0.20>,{exit_status,1}}
      ** When Server state == {state,#Port<0.20>,25548,
                                  {ns_couchdb,
       
      .................
                                  {ringbuffer,133,1024,
                                      {[{<<"{\"Kernel pid terminated\",application_controller,\"{application_terminated,ns_couchdb,shutdown}\"}">>,
                                         95}],
                                       [{<<"{removed_failing_handler,error_logger}">>,
                                         38}]}},
                                  undefined,undefined,[],0}
      ** Reason for termination ==
      ** {abnormal,1}
      

      The problem did not reproduce on subsequent runs.
      Unfortunately, ns_couchdb.log has already been rotated out.

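      The code in logger_backend.erl (OTP logger) indicates that the error_logger handler raised an exception and was removed. The exception details (class, reason, stack trace) are only logged internally at debug level, and no additional information is found anywhere in the logs: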

                          try Module:log(Log1,HandlerConfig1)
                          catch C:R:S ->
                                  case logger:remove_handler(Id) of
                                      ok ->
                                          logger:internal_log(
                                            error,{removed_failing_handler,Id}),
                                          ?LOG_INTERNAL(
                                             debug,
                                             Log1,
                                             [{logger,removed_failing_handler},
                                              {handler,{Id,Module}},
                                              {log_event,Log1},
                                              {config,HandlerConfig1},
                                              {reason,{C,R,filter_stacktrace(S)}}]);
      

      At this point I don't think we can pinpoint what caused this. Creating this ticket as a point of reference in case a similar situation is seen again.
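      To help spot a recurrence, here is a minimal Python sketch that scans ns_server-style log text for the removed_failing_handler signature seen above. It is not part of ns_server or testrunner. The names HandlerRemoval and find_handler_removals are hypothetical, and the regular expression assumes the "[logger:level,timestamp,node:process:module:function:line]" header layout shown in the stderr excerpt.

```python
import re
from typing import List, NamedTuple


class HandlerRemoval(NamedTuple):
    """One log line reporting that OTP logger removed a failing handler."""
    timestamp: str
    node: str
    handler: str


# Assumed header layout (taken from the excerpt in this ticket):
#   [logger:level,timestamp,node:process:module:function:line]message
_PATTERN = re.compile(
    r"\[\w+:\w+,"                     # "[ns_server:info,"
    r"(?P<timestamp>[^,\]]+),"        # "2021-12-02T19:20:03.309Z,"
    r"(?P<node>[^:\]]+):[^\]]*\]"     # "n_0@127.0.0.1:...]"
    r".*?\{removed_failing_handler,(?P<handler>[^}]+)\}"
)


def find_handler_removals(log_text: str) -> List[HandlerRemoval]:
    """Return every header-prefixed line that reports a removed handler."""
    hits = []
    for line in log_text.splitlines():
        match = _PATTERN.search(line)
        if match:
            hits.append(
                HandlerRemoval(
                    match["timestamp"], match["node"], match["handler"]
                )
            )
    return hits


# Two lines copied from the ticket: the stderr line and the ring buffer echo.
SAMPLE = (
    "[ns_server:info,2021-12-02T19:20:03.309Z,"
    "n_0@127.0.0.1:ns_couchdb_port<0.376.0>:ns_port_server:log:226]"
    "ns_couchdb<0.376.0>: {removed_failing_handler,error_logger}\n"
    '[{<<"{removed_failing_handler,error_logger}">>,\n'
)

if __name__ == "__main__":
    for hit in find_handler_removals(SAMPLE):
        print(hit.timestamp, hit.node, hit.handler)
```

      Only lines that carry a log header are reported, so the copy of the message inside the ring buffer dump is not counted a second time. A hit only says that a handler was removed; the underlying exception would still have to come from ns_couchdb.log or from debug-level logger output.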

          People

            Balakumaran.Gopal Balakumaran Gopal
            artem Artem Stemkovski