Elasticsearch not responding, high cpu usage

Hello.
I'm having issues with the elasticsearch plugin and came here to ask for help.
ELK runs on a dual-core VPS (Ubuntu 16.04.1 LTS, 8GB RAM, 200GB SSD). It is a single-node setup.
Versions:
elasticsearch 2.4.0
filebeat 1.3.1
kibana 4.5.4
logstash 1:2.3.4-1
The ELK status page gave me this warning: plugin:elasticsearch Request Timeout after 3000ms
Status: Red
Heap Total (MB): 431.49
Heap Used (MB): 421.73
Load: 1.08, 1.14, 1.15
Response Time Avg (ms): 1.92
Response Time Max (ms): 1.92
Requests Per Second: 0.02
I checked elasticsearch.log and found that it keeps repeating the following message:

[2017-03-13 14:36:11,125][DEBUG][action.admin.cluster.node.stats] [Moira MacTaggert] failed to execute on node [VeBXd8weSo64h8QnAiqs8w]
ReceiveTimeoutTransportException[[Moira MacTaggert][127.0.0.1:9300][cluster:monitor/nodes/stats[n]] request_id [7229597] timed out after [18103ms]]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:696)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2017-03-13 14:36:28,213][WARN ][transport ] [Moira MacTaggert] Received response for a request that has timed out, sent [35191ms] ago, timed out [17088ms] ago, action [cluster:monitor/nodes/stats[n]], node [{Moira MacTaggert}{VeBXd8weSo64h8QnAiqs8w}{127.0.0.1}{127.0.0.1:9300}], id [7229597]

curl -XGET "http://localhost:9200/_nodes/hot_threads" returns
curl: (56) Recv failure: Connection reset by peer
The HTTP port is left at the default (# http.port: 9200).
top output:
KiB Mem :  8175356 total,  1209128 free,  2259700 used,  4706528 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  5493028 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22250 elastic+  20   0 25.200g 1.592g 385128 S 103.6 20.4  48073:39 java
All options in /etc/default/elasticsearch and elasticsearch.yml are default.
I'm confused, as I can't identify what has been causing such high CPU load for many hours. I assume that elasticsearch is busy with something and can't process requests normally or answer requests on port 9200. I tried googling, and the only suggestions I found were to increase the timeout value, increase the heap size, and/or restart elasticsearch. But this is not the first time I've hit this problem, and a restart is not a real solution. Please help me get elasticsearch working normally again.

Hey,

if getting the hot threads returns a connection reset by peer error, that in itself points to a problem. The long timeouts you pasted also suggest that your cluster is overloaded. Is there anything else in the log files, such as long GC pauses? Without further information (the hot threads output would be a good help if you can get it), it is pretty hard to help.
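
One way to check for long GC pauses is to filter the log for garbage collection messages. This is only a rough sketch: it assumes the 2.x log format, where slow collections are reported by monitor.jvm in lines containing a literal [gc] tag, and the helper name gc_lines is made up for illustration.

```shell
# Hypothetical helper: print only the lines of an Elasticsearch log
# that carry the literal "[gc]" tag (assumed 2.x monitor.jvm format).
gc_lines() {
  # -F treats "[gc]" as a fixed string, not a bracket expression
  grep -F '[gc]' "$1"
}

# Usage (the log path is an assumption, adjust it to your install):
#   gc_lines /var/log/elasticsearch/elasticsearch.log | tail -n 20
```

Frequent [gc][old] entries with durations of several seconds would be consistent with a heap that is too small for the workload.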

--Alex

No, there is nothing else in the log files. And I think I found the cause of the trouble: elasticsearch was started with the default 1 GB Java heap, which is too small according to the documentation. I increased this value to 3 GB and restarted elasticsearch (I had to SIGKILL it, because its CPU and RAM utilization had not changed for half an hour after SIGTERM). Now elasticsearch uses more physical memory and doesn't hang, at least for now.

ES_HEAP_SIZE should be set in the /etc/default/elasticsearch file to take effect.
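
For reference, a minimal sketch of that setting, using the 3 GB value mentioned above. The restart and verification commands in the comments are assumptions about a typical package install on Ubuntu, not something taken from this thread.

```shell
# /etc/default/elasticsearch (the file is read as shell variable assignments)
# Heap size for the Elasticsearch JVM; 3g matches the value used above.
ES_HEAP_SIZE=3g

# After editing, restart the service and check the heap the node reports
# (assumed commands, adjust to your setup):
#   sudo service elasticsearch restart
#   curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.current,heap.percent,heap.max'
```

If heap.max still shows roughly 1 GB after the restart, the setting was probably not picked up from this file.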
