I have a Logstash pipeline set up with multiple Elasticsearch filter plugins that look up field values on events ingested via an input plugin. Throughput for processing and writing events into Elasticsearch was around 500,000 events per hour. I added a few more Elasticsearch filter plugins and, to be fair, the new queries are slightly more complex than the previous ones, so I was expecting a small performance hit. However, the number of events written to Elasticsearch has dropped to less than 100,000 per hour.
I checked the CPU usage on the Elasticsearch nodes and it does not exceed 30%. I suspect these new Elasticsearch filters are bottlenecking the flow. How can I dig further to verify this, perhaps by getting some stats that show where the bottleneck is?
To add on, what are some configurations I could look into to potentially improve the search performance? On Logstash's side I have configured persistent queueing, but because of the bottleneck I'm assuming events are getting dropped.
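For context, the persistent queue is enabled via logstash.yml along these lines (the sizes below are placeholders, not my actual values):

```
# logstash.yml -- persistent queue settings (illustrative values only)
queue.type: persisted
queue.max_bytes: 4gb           # disk capacity of the queue before backpressure is applied upstream
queue.checkpoint.writes: 1024  # number of written events between forced checkpoints
```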
I suggest using the pipeline stats API to look at the time spent in each plugin (input, filter, and output). If most of the cost of your pipeline is in the elasticsearch filter and you add a second one, that could double the cost of the pipeline and halve the throughput.
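For example, something along these lines against the Logstash monitoring API (assuming the default API host and port of localhost:9600) will give you per-plugin timings:

```
# Compare duration_in_millis against events out for each filter plugin
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'
```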
It will be easier to interpret the output if you set the id option on each filter plugin. Otherwise the ids will be randomly generated.
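Roughly like this (a sketch only; the index, query, and fields below are placeholders, not your actual lookup):

```
filter {
  elasticsearch {
    id     => "lookup_user_details"      # appears verbatim in the pipeline stats output
    index  => "users-*"                  # placeholder index
    query  => "user_id:%{[user][id]}"    # placeholder query string
    fields => { "department" => "user_dept" }
  }
}
```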
In the past I had an issue with Logstash being limited when writing to Elasticsearch; it turned out to be an open file limit and some kernel parameters on my Linux host. I write around 420,000 events per hour without issue.
Things to check (a rough sketch of where each one lives is below):

- One of them was in the startup service file: increase `LimitNOFILE=`.
- Another is the kernel parameter `fs.file-max`.
- Look for the shared memory segment settings in sysctl.conf and increase them if you can.
- Also check /etc/security/limits.conf and raise the open file limit for the logstash user.
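A sketch of where those settings live (the paths and numbers are examples only, tune them for your own host):

```
# systemd drop-in, e.g. /etc/systemd/system/logstash.service.d/override.conf
[Service]
LimitNOFILE=65536

# /etc/sysctl.conf (apply with `sysctl -p`)
fs.file-max = 200000

# /etc/security/limits.conf
logstash  soft  nofile  65536
logstash  hard  nofile  65536
```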
As seen above, the millis-per-event values were higher, as was the worker utilization. Are there optimizations I can do on Logstash's side, perhaps increasing the number of workers? Or is this purely down to Elasticsearch's search performance?
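If it helps, these are the knobs I was planning to experiment with in logstash.yml (the values are guesses on my part, not measured recommendations):

```
# logstash.yml -- worker/batch tuning (illustrative values)
pipeline.workers: 8       # defaults to the number of CPU cores
pipeline.batch.size: 250  # events per worker per batch; larger batches mean fewer, bigger requests
```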
Thanks for the suggestion. I will try this out, but it most likely won't fix the current issue, as I found out the main culprit is Elasticsearch search performance.
Er, you may also want to revisit that claim/assumption and double-check the filters!? Back of the envelope, that was roughly 100x more milliseconds spent, as per your screenshots.