Simple log processing without Logstash


I want to use Filebeat to send logs with a simple structure (date in a custom format, HTTP code, processing time in ms, query text) to Elasticsearch.
Each log line should be parsed so that these values go into separate fields in the index.

I also want to add another field (the length of the query text in characters, given that it is UTF-8 encoded), and I want to truncate the actual text so that it fits in 32 KB (because of an ES limitation).
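To make the requirement concrete, here is a minimal Python sketch of the transformation I have in mind. The line layout, the field names, and the exact byte limit (32766, which I am assuming is the relevant Lucene term limit) are illustrative assumptions, not my real log format.

```python
import re

# Hypothetical line format: "<date> <http code> <time in ms> <query text>"
# e.g. "12.05.2017-10:15:30 200 153 select * from t"
LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<http_code>\d{3}) (?P<time_ms>\d+) (?P<query>.*)$"
)

# Assumed limit: Lucene rejects single terms longer than 32766 bytes.
MAX_QUERY_BYTES = 32766


def truncate_utf8(text, max_bytes=MAX_QUERY_BYTES):
    """Cut text so its UTF-8 encoding fits in max_bytes without splitting a character."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete trailing character, if any.
    return raw[:max_bytes].decode("utf-8", errors="ignore")


def parse_line(line):
    """Return a dict ready to be indexed, or None if the line does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    query = m.group("query")
    return {
        "date": m.group("date"),
        "http_code": int(m.group("http_code")),
        "time_ms": int(m.group("time_ms")),
        "query_length": len(query),  # length in characters, before truncation
        "query": truncate_utf8(query),
    }
```

The length is taken before truncation, so the extra field still reports the size of the original query.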

As far as I understand, I can do all of this in Logstash (and even add a custom handler written in Ruby).

The question is: is it possible to avoid Logstash entirely and do these transformations with Filebeat alone (possibly together with Elasticsearch features such as the ingest API and pipelines)?


Why not Logstash?

Filebeat can't do this on its own. An ingest pipeline might cover it, but I'm not sure it handles every step.

I feel that Logstash uses too much CPU for such a simple task (receive data over the network, parse each line against a regexp, post the parsed JSON to ES).

It uses almost as much CPU as the ES instance on the same machine, even though ES performs the complex task (indexing the logs) and Logstash only does the dumb part.

If LS is not an option, check whether the ingest processors can do it there. But Logstash is the one with the full power for such transformations.
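As a rough illustration of what the ingest route could look like, the Python snippet below builds a pipeline definition with a grok, a date, and a script processor and prints it as JSON. The line layout, the field names, and the date format are assumptions made for the example. The Painless script counts and cuts by UTF-16 code units rather than bytes, so it uses a conservative character cap and could in principle split a surrogate pair; on 5.x clusters the script key is `inline` rather than `source`.

```python
import json

# Hypothetical line format: "<date> <http code> <time in ms> <query text>",
# with the date written like 12.05.2017-10:15:30 (no spaces).

# 32766 // 3: a UTF-16 code unit needs at most 3 bytes in UTF-8,
# so this many code units always fit in 32766 bytes.
MAX_QUERY_CHARS = 10922

pipeline = {
    "description": "Parse application log lines (illustrative sketch)",
    "processors": [
        {"grok": {
            "field": "message",
            "patterns": [
                "%{NOTSPACE:log_date} %{NUMBER:http_code:int} "
                "%{NUMBER:time_ms:int} %{GREEDYDATA:query}"
            ],
        }},
        {"date": {
            "field": "log_date",
            "formats": ["dd.MM.yyyy-HH:mm:ss"],  # assumed custom date format
            "target_field": "@timestamp",
        }},
        {"script": {
            "lang": "painless",
            "source": (
                "ctx.query_length = ctx.query.length();"
                " if (ctx.query.length() > params.max_chars) {"
                " ctx.query = ctx.query.substring(0, params.max_chars); }"
            ),
            "params": {"max_chars": MAX_QUERY_CHARS},
        }},
    ],
}

# Body for: PUT _ingest/pipeline/<pipeline name>
print(json.dumps(pipeline, indent=2))
```

Once such a pipeline is registered, Filebeat can be pointed at it with the `pipeline` setting under `output.elasticsearch`, which would keep Logstash out of the path.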

I tried to load a sample log file into ES via LS. The LS process consumed twice as much CPU as ES. Is that normal? ES does the complex job of indexing the data, while LS only parses lines against a regexp. It feels like LS should be a rather light process, but that is not the case.

Regexps tend to be CPU-expensive. I recommend trying the Logstash dissect filter.
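The idea behind dissect is to split on fixed delimiters instead of running a regexp engine. A minimal Python sketch of that idea follows; it assumes a hypothetical line layout of `<date> <http code> <time in ms> <query text>`, for which a dissect mapping would look roughly like `%{date} %{http_code} %{time_ms} %{query}`.

```python
def dissect_line(line):
    """Split '<date> <http code> <time in ms> <query text>' on the first three spaces.

    No regexp is involved: the first three space-separated tokens become fields
    and everything after the third space is kept as the query text.
    """
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) != 4:
        return None  # line does not have the expected layout
    date, http_code, time_ms, query = parts
    # All values stay strings here; type conversion would be a separate step.
    return {"date": date, "http_code": http_code, "time_ms": time_ms, "query": query}
```

This only works when the delimiters are predictable, which is the trade-off against a regexp: less flexibility in exchange for much cheaper parsing.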

Thanks, I will look at the dissect filter.

Though the same logic (parsing log lines against regexps) written in Python consumes something like an order of magnitude less CPU than Logstash does. So something looks broken here...
