Elasticsearch is running very slowly and we cannot tell whether it is rebalancing

We tried to add new nodes because our data nodes went above 80%, but the cluster is not becoming healthy, and we cannot tell whether replication or shard relocation is in progress. Below are the data node logs:

[INFO ][cluster.service ] [DATA_NODE_10] removed {[NEW_DATA_NODE_09][3lxXOnOmQmGekuBhG-PqsA][ip-172-31-19-50][inet[/172.31.19.50:9300]]{master=false},}, reason: zen-disco-receive(from master [[MASTER_NODE_02][WIAJ0FuYRmG6rH1nHcGymg][ip-172-31-20-153][inet[/172.31.20.153:9300]]{data=false, master=true}])
[2017-09-07 12:15:08,033][INFO ][cluster.service ] [DATA_NODE_10] added {[DATA_NODE_09][wWGAyp7tS0OwcUBkRQcFAw][ip-172-31-19-50][inet[/172.31.19.50:9300]]{master=false},}, reason: zen-disco-receive(from master [[MASTER_NODE_02][WIAJ0FuYRmG6rH1nHcGymg][ip-172-31-20-153][inet[/172.31.20.153:9300]]{data=false, master=true}])
[2017-09-07 13:10:39,249][WARN ][indices.cluster ] [DATA_NODE_10] [[influencer_v1][26]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [influencer_v1][26]: Recovery failed from [DATA_NODE_02][tF3QIC6LQeaVFD-YivYyXQ][ip-172-31-27-226][inet[/172.31.27.226:9300]]{master=false} into [DATA_NODE_10][kLNBypCATVm_3gp7NqBmYw][ip-172-31-27-140][inet[/172.31.27.140:9300]]{master=false} (no activity after [30m])
at org.elasticsearch.indices.recovery.RecoveriesCollection$RecoveryMonitor.doRun(RecoveriesCollection.java:235)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.ElasticsearchTimeoutException: no activity after [30m]
... 5 more
[2017-09-07 13:10:39,259][WARN ][indices.cluster ] [DATA_NODE_10] [[influencer_v1][25]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [influencer_v1][25]: Recovery failed from [DATA_NODE_07][CZ_M113XQ1aU0uqJKffFIQ][ip-172-31-29-249][inet[/172.31.29.249:9300]]{master=false} into [DATA_NODE_10][kLNBypCATVm_3gp7NqBmYw][ip-172-31-27-140][inet[/172.31.27.140:9300]]{master=false} (no activity after [30m])
at org.elasticsearch.indices.recovery.RecoveriesCollection$RecoveryMonitor.doRun(RecoveriesCollection.java:235)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.ElasticsearchTimeoutException: no activity after [30m]
... 5 more

The "no activity after [30m]" timeout means that no data was sent from the source node to DATA_NODE_10 for 30 minutes during shard recovery. Either I/O is very slow, no resources are available to read the data, or there is a network issue.
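
To check whether shards are actually moving, the cluster health API reports how many shards are relocating, initializing, or unassigned. Below is a minimal Python sketch, not a definitive tool. It assumes a node answers HTTP on localhost:9200 (an assumption, adjust the host), and the helper names fetch_cluster_health and summarize_health are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url="http://localhost:9200"):
    """Fetch /_cluster/health as a dict.

    base_url is an assumption; point it at any reachable node.
    """
    with urlopen(base_url + "/_cluster/health", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_health(health):
    """Turn a cluster health response into a one-line summary."""
    relocating = health.get("relocating_shards", 0)
    initializing = health.get("initializing_shards", 0)
    unassigned = health.get("unassigned_shards", 0)

    if relocating or initializing:
        activity = "shards are moving or recovering"
    elif unassigned:
        activity = "shards are unassigned and nothing is recovering"
    else:
        activity = "no shard movement in progress"

    return "status={} nodes={} relocating={} initializing={} unassigned={}: {}".format(
        health.get("status", "?"),
        health.get("number_of_nodes", "?"),
        relocating,
        initializing,
        unassigned,
        activity,
    )
```

Calling print(summarize_health(fetch_cluster_health())) every few minutes shows whether the counts change. For per-shard progress, GET /_cat/recovery?v lists each recovery with its source node, target node, and percentage done.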

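The first log line indicates that the new data node (NEW_DATA_NODE_09) is dropping out of the cluster and being removed again, so it does not seem to stay even though it joined. In the next line, a node at the same address (172.31.19.50) joins under a different name and node ID.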
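
To see how often nodes join and leave, the cluster.service lines can be pulled out of the log and grouped by address. This is a rough Python sketch that assumes the log format shown in the question; the function name parse_membership_events and the regular expression are illustrative, not part of Elasticsearch.

```python
import re

# Matches lines such as:
# [INFO ][cluster.service ] [DATA_NODE_10] removed {[NAME][ID][HOST][inet[/IP:PORT]]...
MEMBERSHIP_RE = re.compile(
    r"\[cluster\.service\s*\] \[(?P<reporter>[^\]]+)\] "
    r"(?P<action>added|removed) "
    r"\{\[(?P<name>[^\]]+)\]\[(?P<node_id>[^\]]+)\]\[(?P<host>[^\]]+)\]"
    r"\[inet\[/(?P<ip>[\d.]+):(?P<port>\d+)\]\]"
)


def parse_membership_events(lines):
    """Return one dict per added/removed event found in the log lines."""
    events = []
    for line in lines:
        match = MEMBERSHIP_RE.search(line)
        if match:
            events.append(match.groupdict())
    return events


def node_ids_by_ip(events):
    """Map each IP address to the node IDs seen for it, in order of appearance."""
    seen = {}
    for event in events:
        ids = seen.setdefault(event["ip"], [])
        if event["node_id"] not in ids:
            ids.append(event["node_id"])
    return seen
```

An address that shows up with several node IDs, or with repeated removed and added events, points at a node that keeps leaving and rejoining, which would also interrupt any recovery targeting it.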
