High CPU load on Elasticsearch from Logstash output

Hi,

I'm a beginner with the ELK stack.
I have an API that sends messages to RabbitMQ.
Logstash reads the messages and adds them to Elasticsearch. When the API is under peak load, the CPU load on Elasticsearch is really high.
It's a single-node cluster, but it has fairly good hardware.

It's not completely clear to me whether Logstash is sending these messages to Elasticsearch in bulk or not.
Could anybody give me some tips on how to improve performance?

My Logstash output looks like this:

elasticsearch {
  index => "api_requests-%{+YYYY.MM.dd}"
  document_id => "%{id}"
  hosts => "elastic-logging-01:9200"
  document_type => "doc"
  template => "/opt/plugin/MappingTemplates/api_requests.json"
  template_name => "api_requests-template"
  manage_template => true
  template_overwrite => true
}

I see you are sending an ID; that could be a cause. See "Bad bulk performance with self-generated id".
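
To make the point about IDs concrete, here is a minimal sketch of what a bulk request body looks like with and without an explicit `_id`. The Logstash elasticsearch output sends events through the Elasticsearch `_bulk` endpoint, which accepts newline-delimited JSON: one action line, then one source line per document. The helper name, index name, and sample events below are made up for illustration, and the `_type` field that older Elasticsearch versions expect is left out. When an `_id` is supplied, Elasticsearch has to check whether a document with that ID already exists before indexing it; with auto-generated IDs it can skip that lookup.

```python
import json

def bulk_body(events, index, id_field=None):
    """Build an NDJSON body for the Elasticsearch _bulk endpoint.

    If id_field is given, each action carries an explicit _id taken from the
    event; otherwise Elasticsearch generates the ID itself.
    """
    lines = []
    for event in events:
        action = {"index": {"_index": index}}
        if id_field is not None:
            action["index"]["_id"] = str(event[id_field])
        lines.append(json.dumps(action))  # action/metadata line
        lines.append(json.dumps(event))   # document source line
    # The bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"

# Hypothetical sample events and index name.
events = [{"id": 1, "path": "/orders"}, {"id": 2, "path": "/users"}]
with_ids = bulk_body(events, "api_requests-2018.01.01", id_field="id")
without_ids = bulk_body(events, "api_requests-2018.01.01")
```

Dropping `document_id` from the Logstash output corresponds to the second form, at the cost of losing de-duplication when the same message is delivered twice.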

Thanks for your reply. Sounds like a good point. I'm moving this into production today. Hopefully this will solve our problem.

Moved it into production, but sadly there is no improvement in Elasticsearch CPU usage. Any other suggestions?

Indexing can be quite CPU intensive. What is the average size of your documents? What indexing throughput are you seeing? What is the specification of the hardware your cluster is running on? Is there anything in the logs around long or frequent GC?

Around 50 messages a second.
Document size is around 1000 bytes.
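
For a rough sense of scale, the two figures above can be turned into a daily volume. This is only back-of-the-envelope arithmetic on the approximate numbers quoted in the thread; it describes raw JSON size, not on-disk index size, and assumes the rate is sustained all day.

```python
# Figures quoted in the thread (both approximate).
docs_per_second = 50
bytes_per_doc = 1000

seconds_per_day = 60 * 60 * 24
bytes_per_second = docs_per_second * bytes_per_doc
docs_per_day = docs_per_second * seconds_per_day
raw_gb_per_day = bytes_per_second * seconds_per_day / 1e9

print(f"{bytes_per_second / 1000:.0f} KB/s")
print(f"{docs_per_day:,} documents/day")
print(f"{raw_gb_per_day:.2f} GB/day of raw JSON")
```

That comes to about 50 KB/s, or 4,320,000 documents and roughly 4.32 GB of raw JSON per day.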

Elasticsearch has 4 CPU cores at 3.2 GHz
and 7 GB of memory.

I found nothing strange in the Elasticsearch or Logstash logs.

You mention that the CPU load is really high. How high is that? Do you have monitoring installed so you can see what is going on?

Under peak load, CPU goes to 100%.
It still cannot keep up with RabbitMQ, and Logstash starts throwing exceptions.
I will install monitoring later today.

Does your template define all fields, or are you doing dynamic mapping? The default dynamic mapping, which analyzes string fields as text and also maps them as keyword, can do more work than you actually need.

The mapping template contains:
"dynamic": "false"
So this should be OK.
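
As an illustration of what `"dynamic": "false"` means in practice: fields that are not listed in the mapping are kept in `_source` but are not indexed, so they add no analysis cost. The sketch below mimics that rule for top-level fields in plain Python. The mapping, field names, helper function, and sample document are all hypothetical and not taken from the actual template.

```python
# Hypothetical mapping fragment; the real api_requests.json is not shown in the thread.
mapping = {
    "dynamic": "false",
    "properties": {
        "id": {"type": "keyword"},
        "path": {"type": "keyword"},
        "duration_ms": {"type": "integer"},
    },
}

def split_fields(document, mapping):
    """Return (indexed, source_only) top-level field names for a document."""
    mapped = set(mapping.get("properties", {}))
    indexed = sorted(f for f in document if f in mapped)
    unmapped = sorted(f for f in document if f not in mapped)
    if mapping.get("dynamic") == "false":
        # Unmapped fields stay in _source but are not indexed.
        return indexed, unmapped
    # Simplification: any other value is treated as dynamic mapping enabled,
    # where unmapped fields are added to the mapping and indexed as well
    # ("strict" would reject the document instead).
    return sorted(document), []

doc = {"id": "abc", "path": "/orders", "user_agent": "curl/7.58"}
indexed, source_only = split_fields(doc, mapping)
```

Here `user_agent` ends up in `source_only`, which is the behaviour that keeps unexpected fields from adding indexing work.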

@rugenl
I analysed the logs from yesterday, and removing the ID did give us some improvement. CPU load is still high, but it's better than before, so thanks for that one!
