Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
- Affects Version: 6.0.0
- Environment: centos2 cluster
- Triage: Untriaged
- Is this a Regression?: Yes
Description
Build: 6.0.0-1693 (RC4)
Test: -test tests/fts/test_fts_alice_component.yml -scope tests/fts/scope_component_fts.yml
Scale: 1
The FTS system test is failing rebalance operations because some nodes are running out of disk space. The issue was also seen in RC3, but could not be investigated further due to lack of resources.
The cluster is live and available for debugging: http://172.23.96.206:8091
This issue was not seen on RC2. It could also be related to MB-31405.
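As a first diagnostic step on an affected node, disk usage under the FTS data path can be confirmed with a sketch like the one below. DATA_DIR is an assumption here; the /data/@fts path comes from the FATA log excerpts in this ticket.

```shell
# Free space on the partition holding the FTS data directory.
# DATA_DIR is adjustable; the log excerpts point at /data/@fts.
DATA_DIR="${DATA_DIR:-/data/@fts}"
df -hP "$DATA_DIR"

# Largest partition-index (pindex) directories, biggest first.
du -sh "$DATA_DIR"/*.pindex 2>/dev/null | sort -rh | head -5
```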
Log Excerpts
[root@localhost logs]# cat error.log | grep -i "rebalance exited" -a5 -b5
7647- [{file,
7696- "/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/lhttpc/lhttpc_client.erl"},
7836- {line,92}]}]}}
7893-[ns_server:error,2018-10-15T22:20:19.781-07:00,ns_1@172.23.96.206:service_rebalancer-fts<0.3722.56>:service_agent:process_bad_results:810]Service call unset_rebalancer (service fts) failed on some nodes:
8097-[{'ns_1@172.23.96.206',nack}]
8127:[user:error,2018-10-15T22:20:19.782-07:00,ns_1@172.23.96.206:<0.22786.0>:ns_orchestrator:do_log_rebalance_completion:1117]Rebalance exited with reason {service_rebalance_failed,fts,
8309- {lost_connection,shutdown}}
8370-[ns_server:error,2018-10-15T22:20:20.225-07:00,ns_1@172.23.96.206:service_stats_collector-fts<0.8908.0>:rest_utils:get_json_local:63]Request to (fts) api/nsstats failed: {error,
8548- {econnrefused,
8601- [{lhttpc_client,send_request,1,
8672- [{file,
--
225890- [{file,
225939- "/home/couchbase/jenkins/workspace/couchbase-server-unix/couchdb/src/lhttpc/lhttpc_client.erl"},
226079- {line,92}]}]}}
[root@localhost logs]# zgrep -i "2018-10-15T22:20" fts.log* | grep -i "FATA"
fts.log.9.gz:2018-10-15T22:20:18.767-07:00 [FATA] scorch AsyncError, treating this as fatal, err: got err persisting snapshot: error persisting segment: open /data/@fts/social_70fa7eefa8e4f81e_6ddbfb54.pindex/store/0000000114cf.zap: no space left on device, stack dump: -- main.initBleveOptions.func1() at init_bleve.go:91
fts.log.9.gz:2018-10-15T22:20:25.484-07:00 [FATA] scorch AsyncError, treating this as fatal, err: got err persisting snapshot: open /data/@fts/st_index_scorch_14d99cdd094405bc_f4e0a48a.pindex/store/000000008225.zap: no space left on device, stack dump: -- main.initBleveOptions.func1() at init_bleve.go:91
fts.log.9.gz:2018-10-15T22:20:31.014-07:00 [FATA] moss OnError, treating this as fatal, err: write /data/@fts/good_state_731de917f63d2eb4_f4e0a48a.pindex/store/data-0000000000000002.moss: no space left on device, stack dump: -- main.initMossOptions.func1() at init_moss.go:69
fts.log.9.gz:2018-10-15T22:20:37.884-07:00 [FATA] moss OnError, treating this as fatal, err: write /data/@fts/good_state_731de917f63d2eb4_f4e0a48a.pindex/store/data-0000000000000005.moss: no space left on device, stack dump: -- main.initMossOptions.func1() at init_moss.go:69
fts.log.9.gz:2018-10-15T22:20:41.279-07:00 [FATA] moss OnError, treating this as fatal, err: open /data/@fts/good_state_731de917f63d2eb4_f4e0a48a.pindex/store/data-0000000000000008.moss: no space left on device, stack dump: -- main.initMossOptions.func1() at init_moss.go:69
fts.log.9.gz:2018-10-15T22:20:50.152-07:00 [FATA] moss OnError, treating this as fatal, err: open /data/@fts/good_state_731de917f63d2eb4_f4e0a48a.pindex/store/data-0000000000000009.moss: no space left on device, stack dump: -- main.initMossOptions.func1() at init_moss.go:69
fts.log.9.gz:2018-10-15T22:20:59.504-07:00 [FATA] moss OnError, treating this as fatal, err: write /data/@fts/good_state_731de917f63d2eb4_f4e0a48a.pindex/store/data-0000000000000005.moss: no space left on device, stack dump: -- main.initMossOptions.func1() at init_moss.go:69
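The FATA lines above can be tallied per pindex to see which indexes hit the full disk. A minimal sketch, assuming the rotated logs have been decompressed into fts.log (the filename here is a stand-in):

```shell
# Count "no space left on device" errors per pindex directory,
# most-affected first. fts.log stands in for the decompressed rotated logs.
grep 'no space left on device' fts.log \
  | grep -o '/data/@fts/[^/]*\.pindex' \
  | sort | uniq -c | sort -rn
```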
Issue Links
- relates to: MB-31405 [FTS] high disk usage during initial indexing in DGM scenario (Closed)