  1. Couchbase Server
  2. MB-47947

[System Test] Node addition failed due to late arriving message


Details

    • Bug
    • Resolution: Fixed
    • Critical
    • 7.1.0
    • 7.1.0
    • ns_server

    Description

      Build - 7.1.0-1138
      Test -

      -test tests/integration/cheshirecat/test_cheshirecat_kv_gsi_coll_xdcr_backup_sgw_fts_itemct_txns_eventing_cbas_scale3.yml -scope tests/integration/cheshirecat/scope_cheshirecat_with_backup.yml
      

      Scale - 3
      Day - 2
      Cycle - 1

      Test step -

      [2021-08-13T07:03:08-07:00, sequoiatools/couchbase-cli:7.0:6ee147] server-add -c 172.23.108.103:8091 --server-add https://172.23.96.148 -u Administrator -p password --server-add-username Administrator --server-add-password password --services fts
       
      Error occurred on container - sequoiatools/couchbase-cli:7.0:[server-add -c 172.23.108.103:8091 --server-add https://172.23.96.148 -u Administrator -p password --server-add-username Administrator --server-add-password password --services fts]
       
      docker logs 6ee147
      docker start 6ee147
       
      WARNING: couchbase-cli version 7.0.0-3653-enterprise does not match couchbase server version 7.1.0-1138-enterprise
      ERROR: Join completion call failed. Failed to start ns_server cluster processes back. Logs might have more details.
      

      On 172.23.108.103
      debug.log

      [ns_server:warn,2021-08-13T07:03:20.229-07:00,ns_1@172.23.108.103:<0.5551.476>:leader_lease_acquire_worker:handle_exception:244]Failed to acquire lease from 'ns_1@172.23.96.148': {exit,
                                                          {noproc,
                                                           {gen_server,call,
                                                            [{leader_lease_agent,
                                                              'ns_1@172.23.96.148'},
                                                             {acquire_lease,
                                                              'ns_1@172.23.108.103',
                                                              <<"b4453a250ca60a648b9795782212885e">>,
                                                              [{timeout,15000},
                                                               {period,15000}]},
                                                             infinity]}}}
      [ns_server:warn,2021-08-13T07:03:21.249-07:00,ns_1@172.23.108.103:<0.5551.476>:leader_lease_acquire_worker:handle_exception:244]Failed to acquire lease from 'ns_1@172.23.96.148': {exit,
                                                          {noproc,
                                                           {gen_server,call,
                                                            [{leader_lease_agent,
                                                              'ns_1@172.23.96.148'},
                                                             {acquire_lease,
                                                              'ns_1@172.23.108.103',
                                                              <<"b4453a250ca60a648b9795782212885e">>,
                                                              [{timeout,15000},
                                                               {period,15000}]},
                                                             infinity]}}}
      [ns_server:info,2021-08-13T07:03:21.786-07:00,ns_1@172.23.108.103:<0.4612.476>:compaction_daemon:maybe_compact_vbucket:743]Compaction of <<"default/245">> has finished with ok
      [ns_server:info,2021-08-13T07:03:21.789-07:00,ns_1@172.23.108.103:<0.8244.476>:compaction_daemon:maybe_compact_vbucket:740]Compacting 'default/246', DataSize = 14144543, FileSize = 35197027, Options = {1628859173,
                                                                                     189117,
                                                                                     false}
      [ns_server:warn,2021-08-13T07:03:22.028-07:00,ns_1@172.23.108.103:users_replicator<0.18659.0>:doc_replicator:loop:108]Remote server node {users_storage,'ns_1@172.23.96.148'} process down: shutdown
      [cluster:debug,2021-08-13T07:03:22.042-07:00,ns_1@172.23.108.103:ns_cluster<0.242.0>:ns_cluster:post_json:821]Reply from [https,"172.23.96.148",18091,"/completeJoin"]:
      {client_error,[<<"Failed to start ns_server cluster processes back. Logs might have more details.">>]}
      [cluster:error,2021-08-13T07:03:22.043-07:00,ns_1@172.23.108.103:ns_cluster<0.242.0>:ns_cluster:node_add_transaction_finish:1128]Add transaction of 'ns_1@172.23.96.148' failed because of {error,
                                                                 complete_join,
                                                                 <<"Join completion call failed. Failed to start ns_server cluster processes back. Logs might have more details.">>}
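When reading excerpts like the one above, it can help to split each entry header into fields and sort entries by timestamp. The following is a minimal sketch only. The header layout `[component:level,timestamp,node:process:module:function:line]message` is inferred from the excerpts in this ticket and is an assumption, not a documented format, and `parse_entry` and `timeline` are hypothetical helper names. Continuation lines of multi-line Erlang terms are skipped rather than attached to their entry.

```python
import re
from datetime import datetime

# Header layout inferred from the excerpts in this ticket (assumption):
# [component:level,timestamp,node:process:module:function:line]message
HEADER = re.compile(
    r"^\[(?P<component>\w+):(?P<level>\w+),"
    r"(?P<timestamp>[^,]+),"
    r"(?P<node>[^:]+):(?P<process>[^:]*<[\d.]+>):"
    r"(?P<module>\w+):(?P<function>\w+):(?P<line>\d+)\]"
    r"(?P<message>.*)$"
)


def parse_entry(line):
    """Return the header fields of one log line, or None for continuation lines."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Timestamps in the excerpts are ISO 8601 with milliseconds and a UTC offset.
    entry["timestamp"] = datetime.fromisoformat(entry["timestamp"])
    entry["line"] = int(entry["line"])
    return entry


def timeline(lines):
    """Parse all header lines and return the entries ordered by timestamp."""
    entries = [e for e in map(parse_entry, lines) if e is not None]
    return sorted(entries, key=lambda e: e["timestamp"])


if __name__ == "__main__":
    # Two header lines copied from the debug.log excerpt, deliberately out of order.
    sample = [
        "[ns_server:warn,2021-08-13T07:03:22.028-07:00,ns_1@172.23.108.103:"
        "users_replicator<0.18659.0>:doc_replicator:loop:108]Remote server node "
        "{users_storage,'ns_1@172.23.96.148'} process down: shutdown",
        "[ns_server:warn,2021-08-13T07:03:20.229-07:00,ns_1@172.23.108.103:"
        "<0.5551.476>:leader_lease_acquire_worker:handle_exception:244]Failed to "
        "acquire lease from 'ns_1@172.23.96.148': {exit,",
    ]
    for e in timeline(sample):
        print(e["timestamp"].isoformat(), e["level"], e["module"], e["function"])
```

Feeding the lines from both nodes into `timeline` gives a single ordered view, which makes it easier to see which node reported a failure first.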
      

      On 172.23.96.148
      error.log

      [ns_server:error,2021-08-13T04:00:15.865-07:00,ns_1@172.23.96.148:prometheus_cfg<0.19204.0>:prometheus_cfg:terminate:529]Terminate: shutdown
      [ns_server:error,2021-08-13T07:03:13.589-07:00,ns_1@172.23.96.148:prometheus_cfg<0.9485.286>:prometheus_cfg:terminate:529]Terminate: shutdown
      [ns_server:error,2021-08-13T07:03:21.973-07:00,ns_1@172.23.96.148:prometheus_cfg<0.7685.298>:prometheus_cfg:terminate:529]Terminate: shutdown
      [ns_server:error,2021-08-13T07:03:21.974-07:00,ns_1@172.23.96.148:<0.4818.298>:prometheus:post_async:188]Prometheus http request failed:
      URL: http://127.0.0.1:9123/-/quit
      Body: 
      Reason: {failed_connect,[{to_address,{"127.0.0.1",9123}},
                               {inet,[inet],econnrefused}]}
      [ns_server:error,2021-08-13T07:03:21.974-07:00,ns_1@172.23.96.148:prometheus_cfg<0.7685.298>:prometheus_cfg:terminate_prometheus:691]Failed to terminate Prometheus gracefully, trying to kill it...
      [cluster:error,2021-08-13T07:03:22.025-07:00,ns_1@172.23.96.148:ns_cluster<0.242.0>:ns_cluster:perform_actual_join:1437]Failed to join cluster because of: {error,
                                          {shutdown,
                                           {failed_to_start_child,ns_server_sup,
                                            {shutdown,
                                             {failed_to_start_child,
                                              memcached_passwords,
                                              {noproc,
                                               {gen_server,call,
                                                [memcached_refresh,
                                                 {apply_to_file,
                                                  "/opt/couchbase/var/lib/couchbase/isasl.pw.tmp",
                                                  "/opt/couchbase/var/lib/couchbase/isasl.pw"}]}}}}}}}
      [ns_server:error,2021-08-13T07:03:25.832-07:00,ns_1@172.23.96.148:prometheus_cfg<0.4644.298>:prometheus_cfg:terminate:529]Terminate: shutdown
      [ns_server:error,2021-08-13T07:03:29.366-07:00,ns_1@172.23.96.148:<0.6488.298>:prometheus:post_async:188]Prometheus http request failed:
      URL: http://127.0.0.1:9123/api/v1/query
      Body: query=%7Bname%3D~%60kv_curr_items%7Ckv_curr_items_tot%7Ckv_mem_used_bytes%7Ccouch_docs_actual_disk_size%7Ccouch_views_actual_disk_size%7Ckv_ep_db_data_size_bytes%7Ckv_ep_bg_fetched%60%7D+or+kv_vb_curr_items%7Bstate%3D%27replica%27%7D+or+kv_vb_num_non_resident%7Bstate%3D%27active%27%7D+or+label_replace%28sum+by+%28bucket%2C+name%29+%28irate%28kv_ops%7Bop%3D%60get%60%7D%5B1m%5D%29%29%2C+%60name%60%2C%60cmd_get%60%2C+%60%60%2C+%60%60%29+or+label_replace%28irate%28kv_ops%7Bop%3D%60get%60%2Cresult%3D%60hit%60%7D%5B1m%5D%29%2C%60name%60%2C%60get_hits%60%2C%60%60%2C%60%60%29+or+label_replace%28sum+by+%28bucket%29+%28irate%28kv_cmd_lookup%5B1m%5D%29+or+irate%28kv_ops%7Bop%3D~%60set%7Cincr%7Cdecr%7Cdelete%7Cdel_meta%7Cget_meta%7Cset_meta%7Cset_ret_meta%7Cdel_ret_meta%60%7D%5B1m%5D%29%29%2C+%60name%60%2C+%60ops%60%2C+%60%60%2C+%60%60%29+or+sum+by+%28bucket%2C+name%29+%28%7Bname%3D~%60index_data_size%7Cindex_disk_size%7Ccouch_spatial_data_size%7Ccouch_spatial_disk_size%7Ccouch_views_data_size%60%7D%29&timeout=5s
      Reason: {failed_connect,[{to_address,{"127.0.0.1",9123}},
                               {inet,[inet],econnrefused}]}
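The `Body:` field of the failed Prometheus request above is form-encoded, which makes the query hard to read. As a minimal sketch, Python's standard `urllib.parse.parse_qs` can decode it. The `body` value below is a shortened stand-in with the same shape as the logged body (an assumption for illustration); paste the full logged string in its place to decode the real query.

```python
from urllib.parse import parse_qs

# Shortened stand-in for the logged request body (illustrative, not the full query).
body = "query=%7Bname%3D~%60kv_curr_items%7Ckv_curr_items_tot%60%7D&timeout=5s"

# parse_qs decodes %XX escapes and '+' and returns a dict of value lists.
params = parse_qs(body)

if __name__ == "__main__":
    print(params["query"][0])
    print(params["timeout"][0])
```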
      

          People

            steve.watanabe Steve Watanabe
            sujay.gad Sujay Gad