Elasticsearch is working very slowly and we cannot tell whether it is rebalancing

abhishek.yadav · September 7, 2017, 2:12pm

We tried to add new nodes because our data nodes went above 80%, but the cluster is not becoming healthy, and we cannot tell whether replication or shard relocation is in progress. Below are the data node logs:

[INFO ][cluster.service ] [DATA_NODE_10] removed {[NEW_DATA_NODE_09][3lxXOnOmQmGekuBhG-PqsA][ip-172-31-19-50][inet[/172.31.19.50:9300]]{master=false},}, reason: zen-disco-receive(from master [[MASTER_NODE_02][WIAJ0FuYRmG6rH1nHcGymg][ip-172-31-20-153][inet[/172.31.20.153:9300]]{data=false, master=true}])
[2017-09-07 12:15:08,033][INFO ][cluster.service ] [DATA_NODE_10] added {[DATA_NODE_09][wWGAyp7tS0OwcUBkRQcFAw][ip-172-31-19-50][inet[/172.31.19.50:9300]]{master=false},}, reason: zen-disco-receive(from master [[MASTER_NODE_02][WIAJ0FuYRmG6rH1nHcGymg][ip-172-31-20-153][inet[/172.31.20.153:9300]]{data=false, master=true}])
[2017-09-07 13:10:39,249][WARN ][indices.cluster ] [DATA_NODE_10] [[influencer_v1][26]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [influencer_v1][26]: Recovery failed from [DATA_NODE_02][tF3QIC6LQeaVFD-YivYyXQ][ip-172-31-27-226][inet[/172.31.27.226:9300]]{master=false} into [DATA_NODE_10][kLNBypCATVm_3gp7NqBmYw][ip-172-31-27-140][inet[/172.31.27.140:9300]]{master=false} (no activity after [30m])
at org.elasticsearch.indices.recovery.RecoveriesCollection$RecoveryMonitor.doRun(RecoveriesCollection.java:235)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.ElasticsearchTimeoutException: no activity after [30m]
... 5 more
[2017-09-07 13:10:39,259][WARN ][indices.cluster ] [DATA_NODE_10] [[influencer_v1][25]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [influencer_v1][25]: Recovery failed from [DATA_NODE_07][CZ_M113XQ1aU0uqJKffFIQ][ip-172-31-29-249][inet[/172.31.29.249:9300]]{master=false} into [DATA_NODE_10][kLNBypCATVm_3gp7NqBmYw][ip-172-31-27-140][inet[/172.31.27.140:9300]]{master=false} (no activity after [30m])
at org.elasticsearch.indices.recovery.RecoveriesCollection$RecoveryMonitor.doRun(RecoveriesCollection.java:235)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.ElasticsearchTimeoutException: no activity after [30m]
... 5 more

spinscale · September 11, 2017, 7:26am

It looks as if no data was sent from one node to another over the course of 30 minutes. This means either that I/O is very slow, that no resources are available to read data, or that there is a network issue going on.
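To see which node pairs are affected, the failed recoveries can be pulled out of the log text. The following is only a rough sketch: the regular expression is written against the `RecoveryFailedException` lines quoted above, and the names `RECOVERY_FAILED`, `failed_recoveries` and `SAMPLE_LOG` are made up for this illustration. Other Elasticsearch versions may format these lines differently.

```python
import re
from collections import Counter

# Matches lines such as:
# ...RecoveryFailedException: [index][shard]: Recovery failed from [SRC]... into [DST]...
RECOVERY_FAILED = re.compile(
    r"RecoveryFailedException: \[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]: "
    r"Recovery failed from \[(?P<source>[^\]]+)\].*? into \[(?P<target>[^\]]+)\]"
)


def failed_recoveries(log_text):
    """Return (index, shard, source_node, target_node) for each failed recovery."""
    return [
        (m.group("index"), int(m.group("shard")), m.group("source"), m.group("target"))
        for m in RECOVERY_FAILED.finditer(log_text)
    ]


# Two exception lines copied from the logs in this thread.
SAMPLE_LOG = (
    "org.elasticsearch.indices.recovery.RecoveryFailedException: [influencer_v1][26]: "
    "Recovery failed from [DATA_NODE_02][tF3QIC6LQeaVFD-YivYyXQ][ip-172-31-27-226]"
    "[inet[/172.31.27.226:9300]]{master=false} into [DATA_NODE_10][kLNBypCATVm_3gp7NqBmYw]"
    "[ip-172-31-27-140][inet[/172.31.27.140:9300]]{master=false} (no activity after [30m])\n"
    "org.elasticsearch.indices.recovery.RecoveryFailedException: [influencer_v1][25]: "
    "Recovery failed from [DATA_NODE_07][CZ_M113XQ1aU0uqJKffFIQ][ip-172-31-29-249]"
    "[inet[/172.31.29.249:9300]]{master=false} into [DATA_NODE_10][kLNBypCATVm_3gp7NqBmYw]"
    "[ip-172-31-27-140][inet[/172.31.27.140:9300]]{master=false} (no activity after [30m])\n"
)

if __name__ == "__main__":
    failures = failed_recoveries(SAMPLE_LOG)
    for index, shard, source, target in failures:
        print("%s[%d]: %s -> %s" % (index, shard, source, target))
    # Count failures per target node to spot a node that never receives data.
    print(Counter(target for _, _, _, target in failures))
```

If most failures share the same target node, as in the two lines here, that node's disk and network are a reasonable first place to look.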

The first line seems to indicate that the new data node is dropping out of the cluster and being removed again, so it does not stay in the cluster even though it joined.
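As for telling whether shards are moving at all, the cluster health API reports the counters `relocating_shards`, `initializing_shards` and `unassigned_shards`. Below is a minimal sketch that reads them. The address `http://localhost:9200` is an assumption, the function names are made up for this illustration, and the labels "in progress", "possibly stalled" and "idle" are my own rough interpretation rather than anything Elasticsearch reports.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """Fetch /_cluster/health as a dict. base_url is an assumed address."""
    with urlopen(base_url + "/_cluster/health", timeout=10) as resp:
        return json.load(resp)


def describe_shard_activity(health):
    """Summarise shard movement from a cluster health response."""
    relocating = health.get("relocating_shards", 0)
    initializing = health.get("initializing_shards", 0)
    unassigned = health.get("unassigned_shards", 0)

    if relocating or initializing:
        state = "in progress"       # shards are moving or being recovered
    elif unassigned:
        state = "possibly stalled"  # shards are waiting but nothing is moving
    else:
        state = "idle"              # nothing moving, nothing waiting

    return "status=%s, %s (relocating=%d, initializing=%d, unassigned=%d)" % (
        health.get("status", "unknown"), state, relocating, initializing, unassigned
    )


if __name__ == "__main__":
    # Needs a reachable node; adjust the address for your cluster.
    print(describe_shard_activity(fetch_health()))
```

Running this a few times in a row shows whether the counters change. For per-shard detail, the `_cat/recovery` endpoint lists individual recoveries with their source and target nodes.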

system · October 9, 2017, 7:46am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.