Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
Cheshire-Cat
-
Untriaged
-
1
-
Unknown
Description
Build: 7.0.0-4766
Componet Clusterops test: -test tests/fts/cheshire-cat/test_fts_clusterops_cheshire_cat_coll_crud.yml -scope tests/fts/cheshire-cat/scope_fts_cheshire_cat.yml
Test Cycle: 1
Test with index create/drop loop.
Test steps:
- there are 5 buckets, out of which 20 static fts indexes are created on collections of 3 buckets. Mutations are going on these collections
- For the collections on other 2 buckets, we create and drop indexes and no mutations are going on these collections.
- Continuously run queries on the indexes of collections of bucket1 and bucket2
- wait for 15 mins
- kill cbft on 172.23.97.217 and wait for 15 mins
- add fts node 172.23.107.4 and start rebalance and wait for 15 mins which failed with :
MB-45254 - Test continued. Kill cbft on 172.23.107.5 and wait for 15 mins
- Rebalance to remove kv node: 172.23.97.232
Seeing below on node 172.23.107.2 and 172.23.97.217
info.log
|
[ns_server:error,2021-03-24T17:18:47.047-07:00,ns_1@172.23.107.2:service_status_keeper_worker<0.8117.0>:rest_utils:get_json:63]Request to (fts) api/nsstatus with headers [] failed: {error,timeout} |
[ns_server:error,2021-03-24T17:18:47.048-07:00,ns_1@172.23.107.2:service_status_keeper-fts<0.8122.0>:service_status_keeper:handle_cast:109]Service service_fts returned incorrect status |
[ns_server:info,2021-03-24T17:18:52.749-07:00,ns_1@172.23.107.2:<0.22630.147>:diag_handler:log_all_dcp_stats:197]logging dcp stats |
[ns_server:info,2021-03-24T17:18:53.409-07:00,ns_1@172.23.107.2:<0.22630.147>:diag_handler:log_all_dcp_stats:201]end of logging dcp stats |
[ns_server:warn,2021-03-24T17:18:53.802-07:00,ns_1@172.23.107.2:timer_lag_recorder<0.7871.0>:timer_lag_recorder:report_missed_msgs:53]Skipped 3 'check_time' messages |
[ns_server:info,2021-03-24T17:18:55.017-07:00,ns_1@172.23.107.2:ns_config_rep<0.22880.147>:ns_config_rep:pull_one_node:359]Pulling config from: 'ns_1@172.23.107.3' |
[ns_server:error,2021-03-24T17:19:03.809-07:00,ns_1@172.23.107.2:service_status_keeper_worker<0.8117.0>:rest_utils:get_json:63]Request to (fts) api/nsstatus with headers [] failed: {error,timeout} |
[ns_server:warn,2021-03-24T17:19:20.737-07:00,ns_1@172.23.107.2:timer_lag_recorder<0.7871.0>:timer_lag_recorder:report_missed_msgs:53]Skipped 1 'check_time' messages |
[ns_server:info,2021-03-24T17:19:24.856-07:00,ns_1@172.23.107.2:ns_config_rep<0.22880.147>:ns_config_rep:pull_one_node:359]Pulling config from: 'ns_1@172.23.97.235' |
[ns_server:error,2021-03-24T17:19:24.933-07:00,ns_1@172.23.107.2:service_status_keeper-fts<0.8122.0>:service_status_keeper:handle_cast:109]Service service_fts returned incorrect status |
[ns_server:error,2021-03-24T17:19:28.002-07:00,ns_1@172.23.107.2:service_agent-fts<0.8865.0>:service_agent:handle_info:291]Lost json rpc connection for service fts, reason shutdown. Terminating. |
[ns_server:error,2021-03-24T17:19:28.014-07:00,ns_1@172.23.107.2:service_agent-fts<0.8865.0>:service_agent:terminate:312]Terminating abnormally |
[user:info,2021-03-24T17:19:28.018-07:00,ns_1@172.23.107.2:<0.7872.0>:ns_log:crash_consumption_loop:69]Service 'fts' exited with status 137. Restarting. Messages: |
in fts.log, we see while restarting fts its been seen "CBAuth database is stale" and crashed with "Service 'fts' exited with status 1"
2021-03-24T17:19:21.359-07:00 [INFO] app_herder: persister progress, indexes: 367, waiting: 82 |
2021-03-24T17:19:34.662-07:00 [INFO] main: /opt/couchbase/bin/cbft started (v0.6.0/5.5.0) |
2021-03-24T17:19:34.662-07:00 [INFO] main: file descriptor limit current: 200000 max: 200000 |
2021-03-24T17:19:34.663-07:00 [INFO] -authType="cbauth" |
2021-03-24T17:19:34.663-07:00 [INFO] -bindGrpc="172.23.107.2:9130,0.0.0.0:9130" |
2021-03-24T17:19:34.663-07:00 [INFO] -bindGrpcSsl="172.23.107.2:19130,0.0.0.0:19130" |
2021-03-24T17:19:34.663-07:00 [INFO] -bindHttp="172.23.107.2:8094,0.0.0.0:8094" |
2021-03-24T17:19:34.663-07:00 [INFO] -bindHttps=":18094" |
2021-03-24T17:19:34.663-07:00 [INFO] -cfgConnect="metakv" |
2021-03-24T17:19:34.663-07:00 [INFO] -container="" |
2021-03-24T17:19:34.663-07:00 [INFO] -dataDir="/data/@fts" |
2021-03-24T17:19:34.663-07:00 [INFO] -extras="172.23.107.2:8091" |
2021-03-24T17:19:34.663-07:00 [INFO] -help="false" |
2021-03-24T17:19:34.663-07:00 [INFO] -options="startCheckServer=skip,slowQueryLogTimeout=5s,defaultMaxPartitionsPerPIndex=171,bleveMaxResultWindow=10000,failoverAssignAllPrimaries=false,hideUI=true,cbaudit=true,ipv6=false,ftsMemoryQuota=17145856000,maxReplicasAllowed=3,bucketTypesAllowed=membase:ephemeral,http2=true,vbuckets=1024" |
2021-03-24T17:19:34.663-07:00 [INFO] -register="wanted" |
2021-03-24T17:19:34.663-07:00 [INFO] -server="http://127.0.0.1:8091" |
2021-03-24T17:19:34.663-07:00 [INFO] -staticDir="static" |
2021-03-24T17:19:34.663-07:00 [INFO] -staticETag="" |
2021-03-24T17:19:34.663-07:00 [INFO] -tags="feed,janitor,pindex,queryer,cbauth_service" |
2021-03-24T17:19:34.663-07:00 [INFO] -tlsCertFile="/opt/couchbase/var/lib/couchbase/config/memcached-cert.pem" |
2021-03-24T17:19:34.663-07:00 [INFO] -tlsKeyFile="/opt/couchbase/var/lib/couchbase/config/memcached-key.pem" |
2021-03-24T17:19:34.663-07:00 [INFO] -uuid="a8d092168910204cb96eb762ec06923a" |
2021-03-24T17:19:34.663-07:00 [INFO] -version="false" |
2021-03-24T17:19:34.663-07:00 [INFO] -weight="1" |
2021-03-24T17:19:34.663-07:00 [INFO] GOMAXPROCS=8 |
2021-03-24T17:19:34.663-07:00 [INFO] main: registered bleve stores |
2021-03-24T17:19:34.663-07:00 [INFO] gtreap |
2021-03-24T17:19:34.663-07:00 [INFO] boltdb |
2021-03-24T17:19:34.663-07:00 [INFO] metrics |
2021-03-24T17:19:34.663-07:00 [INFO] goleveldb |
2021-03-24T17:19:34.663-07:00 [INFO] moss |
2021-03-24T17:19:34.663-07:00 [INFO] main: curr dir: "/opt/couchbase/var/lib/couchbase" |
2021-03-24T17:19:34.663-07:00 [INFO] main: data dir: "/data/@fts" |
2021-03-24T17:20:34.653-07:00 [INFO] cbauth: key: fts/remoteClients registered for tls config updates |
2021-03-24T17:20:34.654-07:00 [INFO] app_herder: memQuota: 17145856000, appQuota: 17145856000, indexQuota: 12859392000, queryQuota: 15431270400 |
2021-03-24T17:20:34.654-07:00 [INFO] main: custom jsoniter json implementation enabled |
2021-03-24T17:20:34.654-07:00 [WARN] main: failed to start audit service with err: CBAuth database is stale. Was never updated yet. -- main.mainStart() at main.go:474 |
2021-03-24T17:20:34.654-07:00 [WARN] cfg_metakv: RunObserveChildren, err: Get http://127.0.0.1:8091/_metakv/fts/cbgt/cfg/?feed=continuous: CBAuth database is stale. Was never updated yet. -- cbgt.NewCfgMetaKv.func1() at cfg_metakv.go:155 |
2021-03-24T17:20:34.654-07:00 [INFO] cbauth: received security change notification, code: 3 |
2021-03-24T17:20:34.654-07:00 [WARN] cbauth: GetTLSConfig failed, err: CBAuth database is stale. Was never updated yet. -- cbgt.(*SecurityContext).refreshConfig() at cbauth.go:135 |
2021-03-24T17:20:34.655-07:00 [INFO] cbauth: key: fts/grpc-ssl registered for tls config updates |
2021-03-24T17:20:34.655-07:00 [INFO] init_grpc: GrpcServer Started at "0.0.0.0:9130", proto: "tcp4" |
2021-03-24T17:20:34.658-07:00 [INFO] main: ctl starting, dryRun: false, waitForMemberNodes: 30, verbose: 3, maxConcurrentPartitionMovesPerNode: 1 |
2021-03-24T17:20:34.658-07:00 [WARN] error in appending certificates to transport's tlsConfig -- cbft.setupHttp2ClientLOCKED() at pindex_bleve.go:2557 |
2021-03-24T17:20:34.658-07:00 [INFO] ctl: run, CurrentMemberNodes err Get http://127.0.0.1:8091/_metakv/fts/cbgt/cfg/nodeDefs-wanted/: CBAuth database is stale. Was never updated yet. |
2021-03-24T17:20:34.658-07:00 [FATA] main: ctl.StartCtl, err: Get http://127.0.0.1:8091/_metakv/fts/cbgt/cfg/nodeDefs-wanted/: CBAuth database is stale. Was never updated yet. -- main.main() at main.go:234 |
2021-03-24T17:20:35.005-07:00 [INFO] main: /opt/couchbase/bin/cbft started (v0.6.0/5.5.0) |
2021-03-24T17:20:35.005-07:00 [INFO] main: file descriptor limit current: 200000 max: 200000 |
2021-03-24T17:20:35.005-07:00 [INFO] -authType="cbauth" |
2021-03-24T17:20:35.005-07:00 [INFO] -bindGrpc="172.23.107.2:9130,0.0.0.0:9130" |
2021-03-24T17:20:35.005-07:00 [INFO] -bindGrpcSsl="172.23.107.2:19130,0.0.0.0:19130" |
|
cbcollect logs:
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.107.2.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.107.3.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.107.4.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.107.5.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.97.215.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.97.217.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.97.227.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.97.232.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.97.235.zip
url : https://cb-jira.s3.us-east-2.amazonaws.com/logs/systestmon-1616643988/collectinfo-2021-03-25T034631-ns_1%40172.23.97.236.zip
########## Cluster config ##################
|
###### fts : 5 ===== > [172.23.107.2:8091 172.23.107.3:8091 172.23.107.5:8091 172.23.97.216:8091 172.23.97.217:8091] ########### |
###### kv : 4 ===== > [172.23.97.215:8091 172.23.97.232:8091 172.23.97.235:8091 172.23.97.237:8091] ########### |
###### n1ql : 2 ===== > [172.23.97.227:8091 172.23.97.236:8091] ########### |
Attachments
Issue Links
- is duplicated by
-
MB-45168 [System Test]: Service 'ns_server' exited with status 137
- Closed