Couchbase Server / MB-47565

Unexpected server error during addition of node

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 7.0.2
    • Affects Version/s: 7.0.1
    • Component/s: ns_server
    • Environment: CentOS 7 64-bit; CB EE 7.0.1-5942

    Description

      Steps of the test
      1. Create a 1-node KV cluster with node .215
      [2021-07-24 10:14:00,609] - [basetestcase:337] INFO - done initializing cluster

      2. Enable n2n encryption
      [2021-07-24 10:14:02,206] - [ntonencryptionBase:58] INFO - Output of node-to-node-encryption command is ['Turned on encryption for node: http://172.23.105.215:8091', 'SUCCESS: Switched node-to-node encryption on']

      3. Change the encryption level to "strict"

      4. Add .217 node to cluster
      [2021-07-24 10:14:04,173] - [rest_client:1540] INFO - adding remote node @172.23.105.217:18091 to this cluster @172.23.105.215:18091

      5. Add .219 node to cluster
      [2021-07-24 10:14:29,902] - [rest_client:1540] INFO - adding remote node @172.23.105.219:18091 to this cluster @172.23.105.215:18091

      6. Add .237 node to cluster

      [2021-07-24 10:15:15,827] - [rest_client:1047] ERROR - POST https://172.23.105.215:18091/controller/addNode body: hostname=https%3A%2F%2F172.23.106.237%3A18091&user=Administrator&password=password headers: {'Content-Type': 'application/x-www-form-urlencoded', 'Authorization': 'Basic QWRtaW5pc3RyYXRvcjpwYXNzd29yZA==', 'Accept': '*/*'} error: 400 reason: unknown b'["Join completion call failed. Got HTTP status 500 from REST call post to https://172.23.106.237:18091/completeJoin. Body was: \\"[\\\\\\"Unexpected server error, request logged.\\\\\\"]\\""]' auth: Administrator:password
      [2021-07-24 10:15:15,886] - [rest_client:3863] INFO - Latest logs from UI on 172.23.105.215:
      [2021-07-24 10:15:15,887] - [rest_client:3864] ERROR - {'node': 'ns_1@172.23.105.215', 'type': 'info', 'code': 5, 'module': 'ns_cluster', 'tstamp': 1627146915823, 'shortText': 'message', 'text': 'Failed to add node 172.23.106.237:18091 to cluster. Join completion call failed. Got HTTP status 500 from REST call post to https://172.23.106.237:18091/completeJoin. Body was: "[\\"Unexpected server error, request logged.\\"]"', 'serverTime': '2021-07-24T10:15:15.823Z'}
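For reference, the failing call in step 6 can be reconstructed from the logged request. This is a minimal Python sketch, assuming only the endpoint, form fields, and Basic auth header visible in the rest_client error above. `build_add_node_request` is a hypothetical helper name, not part of the test framework, and the request is only constructed here, not sent.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_add_node_request(cluster, new_node, user, password):
    """Build (but do not send) the POST /controller/addNode request."""
    # Form body with the three fields seen in the logged request.
    body = urlencode({
        "hostname": f"https://{new_node}:18091",
        "user": user,
        "password": password,
    })
    # HTTP Basic auth token: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        url=f"https://{cluster}:18091/controller/addNode",
        data=body.encode(),
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {token}",
            "Accept": "*/*",
        },
        method="POST",
    )


# Values taken from the log lines in this report.
req = build_add_node_request("172.23.105.215", "172.23.106.237",
                             "Administrator", "password")
```

Actually sending the request would additionally need a TLS context that trusts the cluster certificate, which this sketch leaves out.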

      Observations
      In debug.log on .215:

      [ns_server:warn,2021-07-24T10:15:15.647-07:00,ns_1@172.23.105.215:<0.5050.129>:leader_lease_acquire_worker:handle_exception:244]Failed to acquire lease from 'ns_1@172.23.106.237': {exit,
                                                           {noproc,
                                                            {gen_server,call,
                                                             [{leader_lease_agent,
                                                               'ns_1@172.23.106.237'},
                                                              {acquire_lease,
                                                               'ns_1@172.23.105.215',
                                                               <<"44fd6345e8b3aed8f010ce564e3e3e0e">>,
                                                               [{timeout,15000},
                                                                {period,15000}]},
                                                              infinity]}}}

      [ns_server:warn,2021-07-24T10:15:15.647-07:00,ns_1@172.23.105.215:users_replicator<0.329.0>:doc_replicator:loop:108]Remote server node {users_storage,'ns_1@172.23.106.237'} process down: noproc
      [chronicle:info,2021-07-24T10:15:15.669-07:00,ns_1@172.23.105.215:chronicle_proposer<0.5606.0>:chronicle_proposer:handle_down:1148]Observed agent {chronicle_agent,'ns_1@172.23.106.237'} on peer 'ns_1@172.23.106.237' go down with reason {{badmatch,undefined},[{chronicle_agent,get_config_revision,1,[{file,...},{...}]},{chronicle_agent,check_new_config,2,[{...}|...]},{chronicle_agent,post_append,3,[...]},{chronicle_agent,handle_install_snapshot,10,...},{gen_statem,loop_state_callback,...},{proc_lib,...}]}
      [cluster:debug,2021-07-24T10:15:15.701-07:00,ns_1@172.23.105.215:ns_cluster<0.242.0>:ns_cluster:post_json:821]Reply from [https,"172.23.106.237",18091,"/completeJoin"]:
      {error,rest_error,
             <<"Got HTTP status 500 from REST call post to https://172.23.106.237:18091/completeJoin. Body was: \"[\\\"Unexpected server error, request logged.\\\"]\"">>,
             {bad_status,500,<<"[\"Unexpected server error, request logged.\"]">>}}

      In error.log on .237:

      [ns_server:error,2021-07-24T10:15:15.692-07:00,ns_1@172.23.106.237:<0.23125.63>:menelaus_util:reply_server_error:205]Server error during processing: ["web request failed",
                                       {path,"/completeJoin"},
                                       {method,'POST'},
                                       {type,exit},
                                       {what,
                                        {{{{failed_to_start_child,
                                            #{id => chronicle_rsm_sup,
                                              restart => permanent,
                                              shutdown => infinity,
                                              start =>
                                               {chronicle_rsm_sup,start_link,[]},
                                              type => supervisor},
                                            {{failed_to_start_child,
                                              #{id => kv,restart => permanent,
                                                start =>
                                                 {chronicle_single_rsm_sup,
                                                  start_link,
                                                  [kv,
                                                   <<"02490b0dbea1338d9efccd4646de5f91">>,
                                                   chronicle_kv,[]]},
                                                type => supervisor},
                                              {{shutdown,

       
      [ns_server:error,2021-07-24T10:15:18.779-07:00,ns_1@172.23.106.237:prometheus_cfg<0.23640.63>:prometheus_cfg:handle_info:520]Received exit from <0.23018.63> with reason {gen_event_shutdown, chronicle_compat_event_manager}. Stopping...
      [ns_server:error,2021-07-24T10:15:18.779-07:00,ns_1@172.23.106.237:prometheus_cfg<0.23640.63>:prometheus_cfg:terminate:529]Terminate: {gen_event_shutdown,chronicle_compat_event_manager}
      [user:critical,2021-07-24T10:15:18.797-07:00,ns_1@172.23.106.237:menelaus_sup<0.23412.63>:menelaus_web_sup:start_link:45]Couchbase Server has failed to start on web port 8091 on node 'ns_1@172.23.106.237'. Perhaps another process has taken port 8091 already? If so, please stop that process first before trying again.
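Entries pasted from these logs can run together on one line. A minimal Python sketch for splitting them on the bracketed header follows. The header layout `[logger:level,timestamp,origin]` is an assumption inferred from the lines in this report, not a documented format, and `split_entries` is a hypothetical helper.

```python
import re

# Header shape inferred from the entries in this report (an assumption).
HEADER = re.compile(
    r"\[(?P<logger>\w+):(?P<level>\w+),"
    r"(?P<time>\d{4}-\d{2}-\d{2}T[\d:.]+[+-]\d{2}:\d{2}),"
    r"(?P<origin>[^\]]+)\]"
)


def split_entries(text):
    """Split run-together log text into (header fields, message) pairs."""
    matches = list(HEADER.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # A message runs until the next header, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((m.groupdict(), text[m.end():end].strip()))
    return entries


sample = (
    "[ns_server:error,2021-07-24T10:15:18.779-07:00,"
    "ns_1@172.23.106.237:prometheus_cfg<0.23640.63>:prometheus_cfg:terminate:529]"
    "Terminate: {gen_event_shutdown,chronicle_compat_event_manager} "
    "[user:critical,2021-07-24T10:15:18.797-07:00,"
    "ns_1@172.23.106.237:menelaus_sup<0.23412.63>:menelaus_web_sup:start_link:45]"
    "Couchbase Server has failed to start on web port 8091"
)
entries = split_entries(sample)
```

Each element of `entries` pairs the parsed header fields with the message text, which makes it easier to filter by level or by the originating module.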
      

      Retrying the addition of this node some time later succeeded.
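Since a later retry succeeded, a test harness could tolerate the transient failure with a delayed retry. The sketch below is only an illustration of that workaround, not a fix for the underlying bug. `retry` is a hypothetical helper, and the attempt count and delay are assumptions because the report does not say how long "some time later" was.

```python
import time


def retry(operation, attempts=3, delay_s=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts:
                sleep(delay_s)
    # Every attempt failed: surface the most recent error.
    raise last_error
```

Passing `sleep` as a parameter lets a test substitute a fake so the retry logic can be exercised without real delays.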

       

          People

            sumedh.basarkod Sumedh Basarkod (Inactive)