Logstash CSV filter is too slow

Hi,
I'm trying to parse my logs with the CSV filter, but this plugin is too slow.
My constraint is to use only 1 worker, because we cannot spend more than 1 CPU on parsing. With 1 worker the CSV filter parses about 600 lines per second, while I need about 4,000 lines per second.
I have already tested my Logstash config without the filter (only input from files and output to Elasticsearch) - that configuration handles about 8,000 lines per second.
For comparison, I wrote a simple script in KSH and AWK that parses my logs and sends the results to Elasticsearch via its bulk API. My script achieves about 14,000 lines per second!
I cannot use my script directly, though: it cannot handle log rotation, restarting and continuing from the last position, etc. - all the goodies that the Logstash file input provides.
I'm using Logstash 5.2.1 on Red Hat Linux.
The questions:

  1. What, if anything, can be done to increase the performance of the CSV filter?
  2. If nothing can be done, how can I connect my script to the output of the Logstash file input plugin?

What version are you on?
What OS?
What JVM?
What does your config look like?
What does the data look like?

Java 1.8. GC activity was normal.
Config: the simplest file input, a CSV filter with 15 fields, and an Elasticsearch output - see the sketch below.
Data: ASCII files with about 80,000 lines each. Each line has 15 comma-separated text fields.
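
Roughly, a minimal sketch of that configuration, so others can reproduce the measurement. The file path, the Elasticsearch host, and the 15 column names (f01..f15) are placeholders, since the real values were not posted; the single-worker constraint corresponds to pipeline.workers: 1 in logstash.yml (or starting Logstash with -w 1):

    input {
      file {
        path => "/var/log/myapp/*.log"      # hypothetical path
      }
    }
    filter {
      csv {
        separator => ","
        # placeholder column names - the real 15 field names were not posted
        columns => [ "f01", "f02", "f03", "f04", "f05",
                     "f06", "f07", "f08", "f09", "f10",
                     "f11", "f12", "f13", "f14", "f15" ]
      }
    }
    output {
      elasticsearch {
        hosts => [ "localhost:9200" ]       # hypothetical host
      }
    }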
