Logstash config settings


#1

I have Logstash 6.0 installed on a 4-CPU node with 32 GB of memory, and I would like to improve the performance of my ingest into Elasticsearch. At present I am ingesting an 8 GB file that contains 42 columns of text data. With the default Logstash settings I was able to ingest 6 million records in one hour. How do I improve this ingest speed with changes to my configuration settings? I also noticed the ingest process slowing down after 4 hours; it may be ingesting only 3 million records per hour now. How do I improve ingest efficiency, and how do I account for this slowdown that is happening over time? Thanks.
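For reference, the pipeline settings most commonly tuned for throughput live in `logstash.yml`. A minimal sketch, with illustrative values only (the right numbers depend on measuring your own pipeline, and defaults shown are for Logstash 6.x):

```yaml
# logstash.yml -- illustrative values, not recommendations
pipeline.workers: 4        # defaults to the number of CPU cores
pipeline.batch.size: 250   # events per worker batch (default 125)
pipeline.batch.delay: 50   # ms to wait before flushing an undersized batch
```

Larger batches and more workers only help if Logstash, not the downstream system, is the bottleneck, which is the question raised in the replies below.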


(Christian Dahlqvist) #2

How have you identified that it is Logstash and not Elasticsearch that is the bottleneck? Are you letting Elasticsearch set document IDs or are you supplying them yourself when indexing?


#3

I am letting Elasticsearch set document IDs. Based on the Logstash node's hardware specs, I feel it is not being used to its full potential. For this run, memory utilization showed as 2 GB, whereas we have 32 GB available. How do I decide whether to increase the memory setting, and which environment settings have to be aligned for that change?
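For context, the Logstash heap is configured in `config/jvm.options` rather than in the pipeline config; the default in 6.x is 1 GB. A sketch, assuming you decide a larger heap is warranted (the 4 GB figure is illustrative):

```
# config/jvm.options -- keep initial and max heap equal
-Xms4g
-Xmx4g
```

Note that a larger heap mainly helps with large batch sizes or memory-heavy filters; it does not by itself raise throughput if CPU or the downstream cluster is the limit.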


(Christian Dahlqvist) #4

Logstash performance is typically limited either by CPU or by network performance. It is, however, only able to process data as fast as the systems receiving it can accept it.

Before starting to look at tuning Logstash, have you verified that your downstream systems are able to handle higher throughput and are not the bottleneck here?


#5

No. How do I check on the downstream system?


(Christian Dahlqvist) #6

Where are you sending your data?


#7

I am sending the data to an 8-node Elasticsearch cluster, where each node has a 2-CPU, 32 GB memory specification. This was the only ingestion job running against the cluster.


(Christian Dahlqvist) #8

What type of hardware and storage are the Elasticsearch nodes using? Look at CPU usage and disk I/O on your Elasticsearch nodes and verify that none of the Elasticsearch nodes are saturated or limiting performance.
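One way to check this is the nodes stats and cat APIs; a sketch of requests you could run against any node (via curl or Kibana Dev Tools; endpoint paths are standard, the fields to watch are suggestions):

```
GET _nodes/stats/os,process,fs?pretty
GET _cat/thread_pool/bulk?v&h=node_name,active,queue,rejected
```

Sustained high CPU in the `os` section, heavy disk I/O in the `fs` section, or a growing `rejected` count on the bulk thread pool would all point to Elasticsearch, not Logstash, as the bottleneck.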

Documents per second is a poor metric for judging performance, as the size and complexity of documents can significantly affect the amount of work that needs to be done. What is the average size of your documents?

Another factor that can impact indexing performance is how many indices and shards you are actively indexing into. Can you provide some information about this?
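The cat APIs show this at a glance; a sketch of the kind of requests that would answer the question (the `h` column selection is just a suggestion):

```
GET _cat/indices?v&h=index,pri,rep,docs.count,store.size
GET _cat/shards?v
```

The output shows how many indices receive writes and how many primary shards each has, which determines how the bulk load is spread across the 8 nodes.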


(system) #9

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.