Uploaded image for project: 'Couchbase Server'
  1. Couchbase Server
  2. MB-47925

[System Test] cbq-engine crashes multiple times during startup

    XMLWordPrintable

Details

    Description

      Build : 7.1.0-1138
      Test : -test tests/2i/cheshirecat/test_idx_clusterops_cheshire_cat_recovery.yml -scope tests/2i/cheshirecat/scope_idx_cheshire_cat_dgm.yml
      Scale : 3
      Iteration : 1st

      Just after the cluster is formed and query nodes 172.23.97.227 and 172.23.97.236 are added to the cluster, the cbq-engine process restarts multiple times on both the nodes.

      _time=2021-08-12T15:05:58.236-07:00 _level=INFO _msg=Starting audit worker 5
      _time=2021-08-12T15:05:58.236-07:00 _level=INFO _msg=Starting audit worker 7
      _time=2021-08-12T15:05:58.237-07:00 _level=INFO _msg=Starting audit settings worker 1.
      _time=2021-08-12T15:05:58.237-07:00 _level=INFO _msg= Certificates have been refreshed by ns server
      _time=2021-08-12T15:05:58.237-07:00 _level=INFO _msg=HttpEndpoint: ListenTLS Address - [::]:18093
      _time=2021-08-12T15:05:58.237-07:00 _level=INFO _msg=HttpEndpoint: ListenTLS Address - 0.0.0.0:18093
      _time=2021-08-12T15:05:58.238-07:00 _level=INFO _msg=HttpEndpoint: Listen Address - [::]:8093
      _time=2021-08-12T15:05:58.238-07:00 _level=INFO _msg=HttpEndpoint: Listen Address - 0.0.0.0:8093
      _time=2021-08-12T15:05:58.238-07:00 _level=ERROR _msg=Updating node-to-node encryption level: {EncryptData:false DisableNonSSLPorts:false}
      _time=2021-08-12T15:05:58.238-07:00 _level=INFO _msg=Failed to start service: listen tcp6 :8093: bind: address already in use
      _time=2021-08-12T15:05:58.238-07:00 _level=ERROR _msg=cbq-engine (HTTP_ADDR :8093) exiting with error: Failed to start service: listen tcp4 :8093: bind: address already in use
      _time=2021-08-12T15:06:03.481-07:00 _level=INFO _msg=Current nofiles rlimit: 200000 (max: 200000)
      _time=2021-08-12T15:06:03.485-07:00 _level=INFO _msg= Initialization of cbauth succeeded
      _time=2021-08-12T15:06:03.533-07:00 _level=INFO _msg=New store created with url http://127.0.0.1:8091
      _time=2021-08-12T15:06:03.567-07:00 _level=INFO _msg= keyspace N1QL_SYSTEM_BUCKET not found No bucket named N1QL_SYSTEM_BUCKET
      _time=2021-08-12T15:06:03.588-07:00 _level=INFO _msg= Retrying bucket N1QL_SYSTEM_BUCKET
      _time=2021-08-12T15:06:03.589-07:00 _level=WARN _msg=Changing GC percent from 0 to 200
      _time=2021-08-12T15:06:03.594-07:00 _level=INFO _msg=New settings received: {"timeout":0,"numatrs":1024,"n1ql-feat-ctrl":76,"query.settings.curl_whitelist":{"all_access":false,"allowed_urls":[],"disallowed_urls":[]},"max-parallelism":1,"query.settings.tmp_space_dir":"/opt/couchbase/var/lib/couchbase/tmp","completed-limit":4000,"pipeline-batch":16,"prepared-limit":16384,"pipeline-cap":512,"memory-quota":0,"cleanupwindow":"60s","use-cbo":true,"scan-cap":512,"query.settings.tmp_space_size":5120,"completed-threshold":1000,"loglevel":"info","cleanuplostattempts":true,"cleanupclientattempts":true,"txtimeout":"0ms"}
      _time=2021-08-12T15:06:03.594-07:00 _level=INFO _msg=Settings notifier from metakv
      _time=2021-08-12T15:06:03.594-07:00 _level=INFO _msg=New Value for curl allowedlist <ud>map[all_access:false allowed_urls:[] disallowed_urls:[]]</ud>
      2021-08-12T15:06:03.600-07:00 [Info] GSIC - Setting config map[query_tmpspace_dir:/opt/couchbase/var/lib/couchbase/tmp query_tmpspace_limit:5120]
      _time=2021-08-12T15:06:03.600-07:00 _level=INFO _msg= GSI indexer settings have been updated map[query_tmpspace_dir:/opt/couchbase/var/lib/couchbase/tmp query_tmpspace_limit:5120]
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg= n1fty indexer settings have been updated map[query_tmpspace_dir:/opt/couchbase/var/lib/couchbase/tmp query_tmpspace_limit:5120]
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for scan-cap. New value is 512
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for completed-threshold. New value is 1000
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for cleanupclientattempts. New value is true
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for max-parallelism. New value is 1
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for prepared-limit. New value is 16384
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for memory-quota. New value is 0
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for n1ql-feat-ctrl. New value is 76
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for pipeline-batch. New value is 16
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for loglevel. New value is info
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for cleanuplostattempts. New value is true
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for completed-limit. New value is 4000
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for use-cbo. New value is true
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for timeout. New value is 0
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for numatrs. New value is 1024
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for pipeline-cap. New value is 512
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for cleanupwindow. New value is 60s
      _time=2021-08-12T15:06:03.601-07:00 _level=INFO _msg=Query Configuration changed for txtimeout. New value is 0ms
      2021/08/12 15:06:03 audit: created new audit service
      _time=2021-08-12T15:06:03.619-07:00 _level=INFO _msg=cbq-engine started version=7.1.0-N1QL datastore=http://127.0.0.1:8091 max-concurrency=8 loglevel=INFO servicers=32 plus-servicers=128 scan-cap=512 pipeline-cap=512 pipeline-batch=16 request-cap=256 request-size-cap=67108864 max-index-api=4 max-parallelism=1 n1ql-feat-ctrl=76 use-cbo=true timeout=0s txtimeout=0s gc-percent=200
      

      From the error logs :

      172.23.97.227 : crash
      [user:info,2021-08-12T15:05:58.246-07:00,ns_1@172.23.97.227:<0.7788.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:03.637-07:00,ns_1@172.23.97.227:<0.7788.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
       
      172.23.97.236 : crash
      [user:info,2021-08-12T15:05:58.243-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:03.625-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:09.174-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:14.580-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:20.046-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:25.457-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:30.868-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:36.262-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:41.677-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      [user:info,2021-08-12T15:06:47.087-07:00,ns_1@172.23.97.236:<0.9942.0>:ns_log:crash_consumption_loop:63]Service 'query' exited with status 1. Restarting. Messages:
      

      This behavior is seen in multiple tests. This was last run on Neo with build 7.1.0-1085, and the issue was not seen then. So this is a regression.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            isha Isha Kandaswamy (Inactive)
            mihir.kamdar Mihir Kamdar (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Gerrit Reviews

                There are no open Gerrit changes

                PagerDuty