Couchbase Server
MB-47605

{error,wait_for_node_failed} during addition of node


Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Fix Version/s: 7.0.2
    • Affects Version/s: 7.0.1
    • Component/s: ns_server
    • Labels: None
    • Environment: CentOS 7 64-bit; CB EE 7.0.1-5961

    Description

      Summary:
      Addition of a node failed with an unexpected server error.

      Steps of the test:
      1. Create a one-node cluster on .215 with all services except cbas.
      2. Enable node-to-node (n2n) encryption and set the cluster encryption level to "strict" (a REST-level sketch of this step follows the log output below).

      2021-07-27 04:24:30 | INFO | MainProcess | test_thread | [ntonencryptionBase.change_cluster_encryption_cli] Output of setting-security command is ['SUCCESS: Security settings updated']
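
      For reference, the same setting can be applied over the REST API (couchbase-cli setting-security ultimately drives the same endpoint). A minimal sketch, assuming the cluster address and credentials used elsewhere in this ticket, and assuming node-to-node encryption has already been enabled:

          # Hypothetical reproduction sketch, not the testrunner code: set the
          # cluster encryption level to "strict" via POST /settings/security.
          # Assumes node-to-node encryption is already enabled on the cluster.
          import requests

          resp = requests.post(
              "http://172.23.105.215:8091/settings/security",
              auth=("Administrator", "password"),
              data={"clusterEncryptionLevel": "strict"},
          )
          print(resp.status_code, resp.text)  # expect 200 on success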

      3. Create a "default" bucket and load a few docs.
      4. Rebalance in the .217 node (a standalone reproduction of the failing request is sketched after the error log below).

      2021-07-27 04:25:09 | INFO | MainProcess | Cluster_Thread | [rest_client.add_node] adding remote node @172.23.105.217:18091 to this cluster @172.23.105.215:18091

      2021-07-27 04:25:30 | ERROR | MainProcess | Cluster_Thread | [rest_client._http_request] POST https://172.23.105.215:18091/controller/addNode body: hostname=https%3A%2F%2F172.23.105.217%3A18091&user=Administrator&password=password headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Accept': '*/*'} error: 500 reason: unknown b'["Unexpected server error, request logged."]' auth: Administrator:password
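
      The failing call can be reproduced in isolation with a plain HTTPS POST. A standalone sketch based on the request logged above; verify=False stands in for the test cluster's self-signed certificates, and the services value is taken from the [kv] term in the crash report below:

          # Hypothetical standalone reproduction of the failing add-node call:
          # ask the existing node (.215) to add .217 with the kv service.
          import requests

          resp = requests.post(
              "https://172.23.105.215:18091/controller/addNode",
              auth=("Administrator", "password"),
              data={
                  "hostname": "https://172.23.105.217:18091",
                  "user": "Administrator",
                  "password": "password",
                  "services": "kv",
              },
              verify=False,  # self-signed certificates on the test cluster
          )
          # The bug manifests as HTTP 500 with body
          # ["Unexpected server error, request logged."]
          print(resp.status_code, resp.text)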

      Observations

      In the debug.log on .215, ns_cluster terminated with {error,wait_for_node_failed} while calling dist_manager's adjust_my_address as part of the add-node operation:

      [error_logger:error,2021-07-27T04:25:30.804-07:00,ns_1@172.23.105.215:ns_cluster<0.242.0>:ale_error_logger_handler:do_log:101]
      =========================ERROR REPORT=========================
      ** Generic server ns_cluster terminating 
      ** Last message in was {add_node_to_group,https,"172.23.105.217",18091,
                                 {"Administrator","password"},
                                 undefined,
                                 [kv]}
      ** When Server state == {state}
      ** Reason for termination ==
      ** {{{error,wait_for_node_failed},
           {gen_server,call,
                       [dist_manager,
                        {adjust_my_address,"172.23.105.215",false,
                                           #Fun<ns_cluster.7.111409773>},
                        infinity]}},
          [{gen_server,call,3,[{file,"gen_server.erl"},{line,223}]},
           {ns_cluster,maybe_rename,2,[{file,"src/ns_cluster.erl"},{line,733}]},
           {ns_cluster,do_change_address,2,[{file,"src/ns_cluster.erl"},{line,709}]},
           {ns_cluster,do_add_node_allowed,6,
                       [{file,"src/ns_cluster.erl"},{line,781}]},
           {ns_cluster,handle_call,3,[{file,"src/ns_cluster.erl"},{line,405}]},
           {gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,661}]},
           {gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,690}]},
           {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
      ** Client <0.16052.0> stacktrace
      ** [{gen,do_call,4,[{file,"gen.erl"},{line,167}]},
          {gen_server,call,3,[{file,"gen_server.erl"},{line,219}]},
          {ns_cluster,add_node_to_group,6,[{file,"src/ns_cluster.erl"},{line,80}]},
          {menelaus_web_cluster,do_handle_add_node,2,
                                [{file,"src/menelaus_web_cluster.erl"},{line,645}]},
          {request_throttler,do_request,3,
                             [{file,"src/request_throttler.erl"},{line,58}]},
          {menelaus_util,handle_request,2,
                         [{file,"src/menelaus_util.erl"},{line,216}]},
          {mochiweb_http,headers,6,
                         [{file,"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/mochiweb/mochiweb_http.erl"},
                          {line,150}]},
          {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]
      

       
      =========================CRASH REPORT=========================
        crasher:
          initial call: ns_cluster:init/1
          pid: <0.242.0>
          registered_name: ns_cluster
          exception exit: {{error,wait_for_node_failed},
                           {gen_server,call,
                                       [dist_manager,
                                        {adjust_my_address,"172.23.105.215",false,
                                                           #Fun<ns_cluster.7.111409773>},
                                        infinity]}}
            in function  gen_server:call/3 (gen_server.erl, line 223)
            in call from ns_cluster:maybe_rename/2 (src/ns_cluster.erl, line 733)
            in call from ns_cluster:do_change_address/2 (src/ns_cluster.erl, line 709)
            in call from ns_cluster:do_add_node_allowed/6 (src/ns_cluster.erl, line 781)
            in call from ns_cluster:handle_call/3 (src/ns_cluster.erl, line 405)
            in call from gen_server:try_handle_call/4 (gen_server.erl, line 661)
            in call from gen_server:handle_msg/6 (gen_server.erl, line 690)
          ancestors: [ns_server_cluster_sup,root_sup,<0.139.0>]
          message_queue_len: 0
          messages: []
          links: [<0.243.0>,<0.206.0>]
          dictionary: []
          trap_exit: false
          status: running
          heap_size: 4185
          stack_size: 27
          reductions: 25857
        neighbours:
      

      At the same time, a few other REST calls appear to have failed. For example:

      [ns_server:error,2021-07-27T04:25:30.902-07:00,ns_1@172.23.105.215:<0.16303.0>:menelaus_util:reply_server_error:205]Server error during processing: ["web request failed",
                                       {path,"/settings/autoFailover"},
                                       {method,'POST'},
                                       {type,exit},
                                       {what,
                                        {noproc,
                                         {gen_server,call,
                                          [request_throttler,
                                           {note_request,<0.16303.0>,rest},
                                           infinity]}}},
                                       {trace,
                                        [{gen_server,call,3,
                                          [{file,"gen_server.erl"},{line,223}]},
                                         {request_throttler,request,3,
                                          [{file,"src/request_throttler.erl"},
                                           {line,34}]},
                                         {menelaus_util,handle_request,2,
                                          [{file,"src/menelaus_util.erl"},
                                           {line,216}]},
                                         {mochiweb_http,headers,6,
                                          [{file,
                                            "/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/mochiweb/mochiweb_http.erl"},
                                           {line,150}]},
                                         {proc_lib,init_p_do_apply,3,
                                          [{file,"proc_lib.erl"},{line,249}]}]}]

      (This seems unrelated, but the backup service also appears to have exited around the same time.)

      [user:info,2021-07-27T04:25:33.191-07:00,ns_1@172.23.105.215:<0.16523.0>:ns_log:crash_consumption_loop:63]Service 'backup' exited with status 1. Restarting. Messages:
      2021-07-27T04:25:30.146-07:00 INFO (Main) Running node version backup-7.0.1-5961- with options: -http-port=8097 -grpc-port=9124 -https-port=18097 -cert-path=/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem -key-path=/opt/couchbase/var/lib/couchbase/config/memcached-key.pem -ipv4=required -ipv6=optional -cbm=/opt/couchbase/bin/cbbackupmgr -node-uuid=6ba3e29c54af0c79834603e9a7596235 -public-address=172.23.105.215 -admin-port=8091 -log-file=none -log-level=debug -integrated-mode -integrated-mode-host=http://127.0.0.1:8091 -secure-integrated-mode-host=https://127.0.0.1:18091 -integrated-mode-user=@backup -default-collect-logs-path=/opt/couchbase/var/lib/couchbase/tmp -cbauth-host=127.0.0.1:8091
      2021-07-27T04:25:30.146-07:00 INFO (Main) Initialized logger {"log level": "debug"}
      2021-07-27T04:25:30.146-07:00 INFO (Main) Getting credentials
      2021-07-27T04:25:30.151-07:00 DEBUG (Main) File limit info {"curr": 200000, "max": 200000, "err": null}
      2021-07-27T04:25:30.160-07:00 ERROR (Main) Failed to run node {"err": "could not get cluster uuid: x509: certificate is valid for 127.0.0.1, not 172.23.105.215"}
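
      The x509 error above indicates the node certificate only covers 127.0.0.1, not the node's public address. A hypothetical diagnostic (not part of the test suite; relies on the third-party cryptography package) to dump the names the certificate served on .215:18091 is actually valid for:

          # Fetch the certificate presented on the TLS management port and
          # print its subject and subjectAltName entries.
          import socket
          import ssl

          from cryptography import x509
          from cryptography.x509.oid import ExtensionOID

          ctx = ssl.create_default_context()
          ctx.check_hostname = False
          ctx.verify_mode = ssl.CERT_NONE  # only reading the cert, not verifying it

          with socket.create_connection(("172.23.105.215", 18091)) as sock:
              with ctx.wrap_socket(sock, server_hostname="172.23.105.215") as tls:
                  der = tls.getpeercert(binary_form=True)

          cert = x509.load_der_x509_certificate(der)
          print("subject:", cert.subject.rfc4514_string())
          try:
              san = cert.extensions.get_extension_for_oid(
                  ExtensionOID.SUBJECT_ALTERNATIVE_NAME).value
              print("SAN DNS:", san.get_values_for_type(x509.DNSName))
              print("SAN IP: ", [str(ip) for ip in san.get_values_for_type(x509.IPAddress)])
          except x509.ExtensionNotFound:
              print("certificate has no subjectAltName extension")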

       

    People

      Assignee: Abhijeeth Nuthan
      Reporter: Sumedh Basarkod (Inactive)
