Details
-
Bug
-
Resolution: User Error
-
Critical
-
6.6.0
-
Untriaged
-
Windows 64-bit
-
1
-
Unknown
Description
What's the issue?
This may be multiple separate issues; however, I think it's sensible to start with the initial rebalance failure and investigate from there. Please feel free to separate the issues if required.
A user on the forums has a two-node cluster on which the initial rebalance failed due to the indexing service. They are now unable to interact with the cluster in a useful/expected manner. The issue is manifesting with the following symptoms:
1) Loading a sample bucket has failed (due to a timeout waiting for the bucket to report as healthy)
2) The user is unable to insert documents into the created bucket
3) ns_server appears to be timing out when communicating with the indexing service
4) Successive rebalances are now failing
5) Indexing appears to be failing to communicate with the projector
From the logs we see a few interesting things worth noting:
initial rebalance failure:
2021-03-25T12:41:40.158+02:00, ns_orchestrator:0:critical:message(ns_1@192.168.0.122) - Rebalance exited with reason {service_rebalance_failed,index,
    {agent_died,<22661.10863.0>,
        {linked_process_died,<22661.10865.0>,
            {no_connection,"index-service_api"}}}}.
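To make the nesting of the rebalance failure reason easier to read, here is a minimal Python sketch that lists the atom opening each nested tuple, outermost first. It is only a triage aid, not a real Erlang term parser: the `REASON` string is the term from the log line above joined onto one line, and `reason_chain` is a hypothetical helper name.

```python
import re

# Reason term copied from the ns_orchestrator log line above, joined onto one line.
REASON = (
    '{service_rebalance_failed,index,'
    '{agent_died,<22661.10863.0>,'
    '{linked_process_died,<22661.10865.0>,'
    '{no_connection,"index-service_api"}}}}'
)


def reason_chain(term):
    """Return the atom that opens each nested tuple, outermost first."""
    # An opening brace followed by a lowercase atom and a comma.
    return re.findall(r"\{([a-z_]+),", term)


chain = reason_chain(REASON)
print(" -> ".join(chain))
```

The last element of the chain, `no_connection`, is the innermost reason, which matches the "index-service_api" connection failure reported above.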
ns_server request timeout:
[ns_server:error,2021-03-25T12:43:36.765+02:00,ns_1@192.168.0.122:service_status_keeper_worker<0.430.0>:rest_utils:get_json:62]Request to (indexer) getIndexStatus failed: {error,timeout}
index service failing to connect/communicate with the projector:
2021-03-25T12:41:01.702+02:00 [Error] KVSender::closeMutationStream MAINT_STREAM Error Received Post http://192.168.0.122:9999/adminport/shutdownTopicRequest: dial tcp 192.168.0.122:9999: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. from 192.168.0.122:9999
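The `dial tcp 192.168.0.122:9999` failure above suggests checking whether the port accepts TCP connections from the indexer's host at all. A minimal Python sketch of such a check follows. It only tests that a TCP connection opens and says nothing about whether the projector's adminport is healthy; `can_connect` and the 3 second default timeout are assumptions made for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refusal, unreachable host, and timeouts (socket.timeout is an OSError).
        return False


# Example, using the address from the log line above:
# can_connect("192.168.0.122", 9999)
```

A `False` result from the node itself would be consistent with a local firewall or a projector process that is not listening, but it does not distinguish between the two.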
memcached slow runtime during warmup:
2021-03-25T12:43:28.127751+02:00 WARNING (beer-sample) Slow runtime for 'Warmup - populate VB Map: shard 7' on thread reader_worker_0: 1079 us
2021-03-25T12:43:39.673165+02:00 WARNING (beer-sample) Slow runtime for 'Running the ALL_DOCS api on vb:908' on thread reader_worker_2: 234 ms
2021-03-25T12:43:39.950189+02:00 WARNING (beer-sample) Slow runtime for 'Running the ALL_DOCS api on vb:916' on thread reader_worker_2: 168 ms
2021-03-25T12:43:40.602306+02:00 WARNING (beer-sample) Slow runtime for 'Running the ALL_DOCS api on vb:963' on thread reader_worker_0: 243 ms
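The slow-runtime warnings above mix units (`us` and `ms`), which makes them easy to misread: the warmup entry is about 1 ms, while the ALL_DOCS entries are in the 168 to 243 ms range. A minimal Python sketch that normalizes them to milliseconds follows. The regular expression is inferred from these four lines only, the `s` unit is an assumption, and `parse_slow_runtime` is a hypothetical helper name.

```python
import re

# The four memcached warnings quoted above.
LINES = [
    "2021-03-25T12:43:28.127751+02:00 WARNING (beer-sample) Slow runtime for 'Warmup - populate VB Map: shard 7' on thread reader_worker_0: 1079 us",
    "2021-03-25T12:43:39.673165+02:00 WARNING (beer-sample) Slow runtime for 'Running the ALL_DOCS api on vb:908' on thread reader_worker_2: 234 ms",
    "2021-03-25T12:43:39.950189+02:00 WARNING (beer-sample) Slow runtime for 'Running the ALL_DOCS api on vb:916' on thread reader_worker_2: 168 ms",
    "2021-03-25T12:43:40.602306+02:00 WARNING (beer-sample) Slow runtime for 'Running the ALL_DOCS api on vb:963' on thread reader_worker_0: 243 ms",
]

# Pattern inferred from the lines above; other memcached formats may differ.
PATTERN = re.compile(
    r"Slow runtime for '(?P<task>.+?)' on thread (?P<thread>\S+): "
    r"(?P<value>\d+) (?P<unit>us|ms|s)\b"
)

# (numerator, denominator) to convert each unit to milliseconds.
# The "s" entry is an assumption; only "us" and "ms" appear in the report.
TO_MS = {"us": (1, 1000), "ms": (1, 1), "s": (1000, 1)}


def parse_slow_runtime(line):
    """Return a dict with task, thread and duration in ms, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    num, den = TO_MS[match["unit"]]
    return {
        "task": match["task"],
        "thread": match["thread"],
        "ms": int(match["value"]) * num / den,
    }


rows = [row for row in map(parse_slow_runtime, LINES) if row is not None]
for row in rows:
    print(f"{row['ms']:>8.3f} ms  {row['thread']}  {row['task']}")
```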
projector errors:
2021-03-25T12:40:40.699+02:00 [Info] pram[:9999] Request "/adminport/shutdownTopicRequest"
2021-03-25T12:40:40.699+02:00 [Info] PROJ[:9999] ##1 doShutdownTopic() "MAINT_STREAM_TOPIC_0aff61dffe090601aa26977b9c56c153"
2021-03-25T12:40:40.699+02:00 [Error] PROJ[:9999] ##1 acquireFeed(): projector.topicMissing
2021-03-25T12:40:40.699+02:00 [Info] PROJ[:9999] ##1 doShutdownTopic() returns ...