Logstash CSV filter is too slow

Hi,
I'm trying to parse my logs with the CSV filter, but this plugin is too slow.
My constraint is to use only 1 worker, because we cannot spend more than 1 CPU on parsing. With 1 worker the CSV filter parses about 600 lines per second, while I need about 4,000 lines per second.
I have already tested my Logstash config without the filter (only input from files and output to Elasticsearch) - that configuration handles about 8,000 lines per second.
For comparison, I wrote a simple script in KSH and AWK that parses my logs and sends the results to Elasticsearch via its bulk API. My script achieves about 14,000 lines per second!
I cannot use my script directly, though: it cannot handle log rotation, restarting and continuing from the last position, etc. - all the goodies that the Logstash file input provides.
I'm using Logstash 5.2.1 on Red Hat Linux.
The questions:

  1. What, if anything, can be done to increase the performance of the CSV filter?
  2. If nothing can be done, how can I connect my script to the output of the Logstash file input plugin?

What version are you on?
What OS?
What JVM?
What does your config look like?
What does the data look like?

Java 1.8. GC activity was normal.
Config: the simplest file input, a CSV filter with 15 fields, and an Elasticsearch output - see the sketch below.
Data: ASCII files with about 80,000 lines each. Each line has 15 comma-separated text fields.
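
Roughly, a minimal sketch of that configuration, so others can reproduce the measurement. The file path, the Elasticsearch host, and the 15 column names (f01..f15) are placeholders, since the real values were not posted; the single-worker constraint corresponds to pipeline.workers: 1 in logstash.yml (or starting Logstash with -w 1):

    input {
      file {
        path => "/var/log/myapp/*.log"      # hypothetical path
      }
    }
    filter {
      csv {
        separator => ","
        # placeholder column names - the real 15 field names were not posted
        columns => [ "f01", "f02", "f03", "f04", "f05",
                     "f06", "f07", "f08", "f09", "f10",
                     "f11", "f12", "f13", "f14", "f15" ]
      }
    }
    output {
      elasticsearch {
        hosts => [ "localhost:9200" ]       # hypothetical host
      }
    }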
