Details
Description
Summary:
Adding a node to the cluster failed with an unexpected server error (HTTP 500).
Steps of the test
1. Create a 1-node cluster on .215 with all services except cbas
2. Enable n2n encryption and change level to "strict"
2021-07-27 04:24:30 | INFO | MainProcess | test_thread | [ntonencryptionBase.change_cluster_encryption_cli] Output of setting-security command is ['SUCCESS: Security settings updated']
3. Create the "default" bucket and load a few docs
4. Rebalance in the .217 node
2021-07-27 04:25:09 | INFO | MainProcess | Cluster_Thread | [rest_client.add_node] adding remote node @172.23.105.217:18091 to this cluster @172.23.105.215:18091
2021-07-27 04:25:30 | ERROR | MainProcess | Cluster_Thread | [rest_client._http_request] POST https://172.23.105.215:18091/controller/addNode body: hostname=https%3A%2F%2F172.23.105.217%3A18091&user=Administrator&password=password headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Accept': '*/*'} error: 500 reason: unknown b'["Unexpected server error, request logged."]' auth: Administrator:password
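For anyone who wants to replay step 4 outside the test framework, the failing request can be rebuilt from the log line above. This is only a sketch: the helper name `build_add_node_request` is made up for illustration, and the endpoint, form fields, and headers are copied from the logged request rather than taken from any API reference. It builds the request but does not send it.

```python
import base64
from urllib.parse import urlencode


def build_add_node_request(cluster_host, new_node_host, user, password, port=18091):
    """Return (url, body, headers) mirroring the logged /controller/addNode POST.

    Hypothetical helper: field names and headers are copied from the test log.
    """
    url = f"https://{cluster_host}:{port}/controller/addNode"
    # urlencode percent-encodes the scheme and port separators in the hostname.
    body = urlencode({
        "hostname": f"https://{new_node_host}:{port}",
        "user": user,
        "password": password,
    })
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Authorization": f"Basic {token}",
        "Accept": "*/*",
    }
    return url, body, headers


url, body, headers = build_add_node_request(
    "172.23.105.215", "172.23.105.217", "Administrator", "password")
print(url)
print(body)
```

The resulting URL, body, and Authorization header should match the ones in the ERROR line, so the output can be handed to any HTTP client to retry the call against a cluster in the same state.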
Observations
In debug.log on .215:
[error_logger:error,2021-07-27T04:25:30.804-07:00,ns_1@172.23.105.215:ns_cluster<0.242.0>:ale_error_logger_handler:do_log:101]
=========================ERROR REPORT=========================
** Generic server ns_cluster terminating
** Last message in was {add_node_to_group,https,"172.23.105.217",18091,
                           {"Administrator","password"},
                           undefined,
                           [kv]}
** When Server state == {state}
** Reason for termination ==
** {{{error,wait_for_node_failed},
     {gen_server,call,
         [dist_manager,
          {adjust_my_address,"172.23.105.215",false,
              #Fun<ns_cluster.7.111409773>},
          infinity]}},
    [{gen_server,call,3,[{file,"gen_server.erl"},{line,223}]},
     {ns_cluster,maybe_rename,2,[{file,"src/ns_cluster.erl"},{line,733}]},
     {ns_cluster,do_change_address,2,[{file,"src/ns_cluster.erl"},{line,709}]},
     {ns_cluster,do_add_node_allowed,6,
         [{file,"src/ns_cluster.erl"},{line,781}]},
     {ns_cluster,handle_call,3,[{file,"src/ns_cluster.erl"},{line,405}]},
     {gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,661}]},
     {gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,690}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
** Client <0.16052.0> stacktrace
** [{gen,do_call,4,[{file,"gen.erl"},{line,167}]},
    {gen_server,call,3,[{file,"gen_server.erl"},{line,219}]},
    {ns_cluster,add_node_to_group,6,[{file,"src/ns_cluster.erl"},{line,80}]},
    {menelaus_web_cluster,do_handle_add_node,2,
        [{file,"src/menelaus_web_cluster.erl"},{line,645}]},
    {request_throttler,do_request,3,
        [{file,"src/request_throttler.erl"},{line,58}]},
    {menelaus_util,handle_request,2,
        [{file,"src/menelaus_util.erl"},{line,216}]},
    {mochiweb_http,headers,6,
        [{file,"/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/mochiweb/mochiweb_http.erl"},
         {line,150}]},
    {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]

=========================CRASH REPORT=========================
  crasher:
    initial call: ns_cluster:init/1
    pid: <0.242.0>
    registered_name: ns_cluster
    exception exit: {{error,wait_for_node_failed},
                     {gen_server,call,
                         [dist_manager,
                          {adjust_my_address,"172.23.105.215",false,
                              #Fun<ns_cluster.7.111409773>},
                          infinity]}}
      in function  gen_server:call/3 (gen_server.erl, line 223)
      in call from ns_cluster:maybe_rename/2 (src/ns_cluster.erl, line 733)
      in call from ns_cluster:do_change_address/2 (src/ns_cluster.erl, line 709)
      in call from ns_cluster:do_add_node_allowed/6 (src/ns_cluster.erl, line 781)
      in call from ns_cluster:handle_call/3 (src/ns_cluster.erl, line 405)
      in call from gen_server:try_handle_call/4 (gen_server.erl, line 661)
      in call from gen_server:handle_msg/6 (gen_server.erl, line 690)
    ancestors: [ns_server_cluster_sup,root_sup,<0.139.0>]
    message_queue_len: 0
    messages: []
    links: [<0.243.0>,<0.206.0>]
    dictionary: []
    trap_exit: false
    status: running
    heap_size: 4185
    stack_size: 27
    reductions: 25857
  neighbours:

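When triaging reports like the one above, it can help to pull the stack frames out of the Erlang term text so the ns_server frames stand out from the OTP ones. The sketch below is an assumption-laden helper, not part of any Couchbase tooling: `FRAME_RE` and `parse_frames` are hypothetical names, and the regex only handles frames of the shape `{Module,Function,Arity,[{file,"..."},{line,N}]}` seen in this report.

```python
import re

# Matches {Module,Function,Arity,[{file,"path"},{line,N}]}, allowing the line
# breaks and indentation that the error logger inserts between elements.
FRAME_RE = re.compile(
    r'\{(\w+),(\w+),(\d+),\s*\[\{file,\s*"([^"]+)"\},\s*\{line,(\d+)\}\]\}')


def parse_frames(report):
    """Return a list of (module, function, arity, file, line) tuples."""
    return [(mod, fun, int(arity), path, int(line))
            for mod, fun, arity, path, line in FRAME_RE.findall(report)]


# Excerpt copied from the ns_cluster termination report.
sample = '''
[{gen_server,call,3,[{file,"gen_server.erl"},{line,223}]},
 {ns_cluster,maybe_rename,2,[{file,"src/ns_cluster.erl"},{line,733}]},
 {ns_cluster,do_add_node_allowed,6,
     [{file,"src/ns_cluster.erl"},{line,781}]}]
'''

frames = parse_frames(sample)
ns_cluster_frames = [f for f in frames if f[0] == "ns_cluster"]
for mod, fun, arity, path, line in ns_cluster_frames:
    print(f"{mod}:{fun}/{arity} at {path}:{line}")
```

Filtering on the module name leaves `maybe_rename/2` and `do_add_node_allowed/6` from the excerpt, which are the ns_cluster frames on the path to the failed `adjust_my_address` call.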
At the same time, a few other REST calls appear to have failed. For example:
[ns_server:error,2021-07-27T04:25:30.902-07:00,ns_1@172.23.105.215:<0.16303.0>:menelaus_util:reply_server_error:205]Server error during processing: ["web request failed",
                                 {path,"/settings/autoFailover"},
                                 {method,'POST'},
                                 {type,exit},
                                 {what,
                                  {noproc,
                                   {gen_server,call,
                                    [request_throttler,
                                     {note_request,<0.16303.0>,rest},
                                     infinity]}}},
                                 {trace,
                                  [{gen_server,call,3,
                                    [{file,"gen_server.erl"},{line,223}]},
                                   {request_throttler,request,3,
                                    [{file,"src/request_throttler.erl"},
                                     {line,34}]},
                                   {menelaus_util,handle_request,2,
                                    [{file,"src/menelaus_util.erl"},
                                     {line,216}]},
                                   {mochiweb_http,headers,6,
                                    [{file,
                                      "/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/mochiweb/mochiweb_http.erl"},
                                     {line,150}]},
                                   {proc_lib,init_p_do_apply,3,
                                    [{file,"proc_lib.erl"},{line,249}]}]}]
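To put a number on "at the same time", the timestamps in the two log headers can be compared directly. The following is a minimal sketch; `TS_RE` and `entry_time` are hypothetical names, and the timestamp layout is assumed to be the one visible in these debug.log entries.

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the debug.log entry headers in this report.
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2}")


def entry_time(header):
    """Extract the timezone-aware timestamp from a log entry header."""
    match = TS_RE.search(header)
    if match is None:
        raise ValueError("no timestamp found in header")
    return datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S.%f%z")


crash = entry_time(
    "[error_logger:error,2021-07-27T04:25:30.804-07:00,"
    "ns_1@172.23.105.215:ns_cluster<0.242.0>:ale_error_logger_handler:do_log:101]")
rest_failure = entry_time(
    "[ns_server:error,2021-07-27T04:25:30.902-07:00,"
    "ns_1@172.23.105.215:<0.16303.0>:menelaus_util:reply_server_error:205]")

gap = rest_failure - crash
print(f"REST failure logged {gap.total_seconds() * 1000:.0f} ms after the ns_cluster termination")
```

For the two entries quoted in this report, the gap works out to 98 ms, so the /settings/autoFailover failure was logged almost immediately after ns_cluster terminated.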
(This seems unrelated, but the backup service also exited.)
[user:info,2021-07-27T04:25:33.191-07:00,ns_1@172.23.105.215:<0.16523.0>:ns_log:crash_consumption_loop:63]Service 'backup' exited with status 1. Restarting. Messages:
2021-07-27T04:25:30.146-07:00 INFO (Main) Running node version backup-7.0.1-5961- with options: -http-port=8097 -grpc-port=9124 -https-port=18097 -cert-path=/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem -key-path=/opt/couchbase/var/lib/couchbase/config/memcached-key.pem -ipv4=required -ipv6=optional -cbm=/opt/couchbase/bin/cbbackupmgr -node-uuid=6ba3e29c54af0c79834603e9a7596235 -public-address=172.23.105.215 -admin-port=8091 -log-file=none -log-level=debug -integrated-mode -integrated-mode-host=http://127.0.0.1:8091 -secure-integrated-mode-host=https://127.0.0.1:18091 -integrated-mode-user=@backup -default-collect-logs-path=/opt/couchbase/var/lib/couchbase/tmp -cbauth-host=127.0.0.1:8091
2021-07-27T04:25:30.146-07:00 INFO (Main) Initialized logger {"log level": "debug"}
2021-07-27T04:25:30.146-07:00 INFO (Main) Getting credentials
2021-07-27T04:25:30.151-07:00 DEBUG (Main) File limit info {"curr": 200000, "max": 200000, "err": null}
2021-07-27T04:25:30.160-07:00 ERROR (Main) Failed to run node {"err": "could not get cluster uuid: x509: certificate is valid for 127.0.0.1, not 172.23.105.215"}
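The x509 error in the last line says the certificate presented to the backup service lists only 127.0.0.1, while verification was attempted against 172.23.105.215. As a rough illustration of why that fails, here is a simplified model of IP subject-alternative-name matching. It is not the verifier the backup service actually uses, and `ip_covered_by_sans` is a hypothetical helper; the addresses come from the error message.

```python
import ipaddress


def ip_covered_by_sans(target, san_ips):
    """Return True if target equals one of the certificate's IP SAN entries.

    Simplified model: IP SANs are compared for exact address equality only.
    """
    wanted = ipaddress.ip_address(target)
    return any(ipaddress.ip_address(entry) == wanted for entry in san_ips)


# Values taken from the error message: the cert is valid for 127.0.0.1 only.
cert_ip_sans = ["127.0.0.1"]

print(ip_covered_by_sans("127.0.0.1", cert_ip_sans))       # loopback address
print(ip_covered_by_sans("172.23.105.215", cert_ip_sans))  # node's public address
```

Under this model the loopback address is accepted and the node's public address is rejected, which is consistent with the error text. Whether this is connected to the addNode failure is not established by the logs above.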