Hi,
We have a Logstash filter section with around 30 different grok filters for different log types.
What we suspect is that whenever we onboard a new type of logs, the data takes a long time to index into Elasticsearch.
I want to know how to break this filter into, say, 10 different filters so that each filter runs in its own thread and there is no performance impact when onboarding data.
Filters in the same pipeline run serially for each event and can't be parallelized by splitting the filter block. To increase filter concurrency you need to increase the number of pipeline workers.
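As a minimal sketch (the worker count here is just an example value, not a recommendation), the worker count can be raised in `logstash.yml`:

```yaml
# logstash.yml -- pipeline.workers defaults to the number of CPU cores;
# raising it only helps if you actually have spare CPU.
pipeline.workers: 8
```

The same setting can be passed on the command line with `bin/logstash -w 8`.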
Hi Magnus,
Thanks for reply.
By "number of pipelines", do you mean the number of Logstash configurations under the pipelines directory, each with a different port and separate grok filters for different types?
What is the relation between threads and filters?
Is there any trade-off analysis for threads, filters, inputs and outputs?
Filters are processed by pipeline worker threads. In a given pipeline, each event is processed serially by the filters in the pipeline. If you have CPU to spare and want to speed up processing by increasing concurrency, you can increase the number of pipeline workers.
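If you do want to split the configuration into separate pipelines (each with its own workers and its own grok filters), they are declared in `pipelines.yml`. A hypothetical sketch, where the pipeline ids, config paths, and worker counts are all example values:

```yaml
# pipelines.yml -- each entry is an independent pipeline with its own
# worker threads; events in one pipeline don't block the other.
- pipeline.id: apache-logs
  path.config: "/etc/logstash/conf.d/apache.conf"
  pipeline.workers: 4
- pipeline.id: app-logs
  path.config: "/etc/logstash/conf.d/app.conf"
  pipeline.workers: 2
```

Each config file would then hold its own input (e.g. listening on its own port) and only the grok filters relevant to that log type.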
Is there any trade-off analysis for threads, filters, inputs and outputs?
The Logstash documentation contains some information about tuning.