Filebeat Logstash CSV filter best practices

Hello @filebeatuser,

I use Filebeat to ship messages to Logstash, which parses the messages and ingests them into the Elasticsearch cluster.

Each message/log line corresponds logically to a tab-separated row in a CSV file. The messages are processed in Logstash after they have been consumed by the Logstash beats input plugin. I have created an index template, an ILM policy, a pipeline and custom field mappings for the "columns" in the messages. Logstash processes the messages in a csv filter section, chooses the custom pipeline and sets the target index. This all works as expected.
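
In rough outline, the pipeline looks like the sketch below (the column names, port, index name and ingest pipeline name are placeholders, not our real values):

```
input {
  beats {
    port => 5044
  }
}

filter {
  csv {
    # the value between the quotes is a literal tab character
    separator => "	"
    columns => ["timestamp", "level", "component", "details"]
  }
}

output {
  elasticsearch {
    hosts => ["https://es01:9200"]
    index => "app-logs"             # placeholder target index
    pipeline => "app-logs-pipeline" # placeholder ingest pipeline
  }
}
```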

However, since Filebeat is used, the original message/row is also stored within each document in Elasticsearch. This is just overhead, since the content of each message is already split and stored in separate fields.
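
For example, a single document ends up looking roughly like this (the values are made up), with the raw line duplicated in message next to the parsed columns:

```
{
  "@timestamp": "2024-05-01T12:00:00.000Z",
  "message": "2024-05-01T12:00:00Z\tINFO\tauth\tuser logged in",
  "timestamp": "2024-05-01T12:00:00Z",
  "level": "INFO",
  "component": "auth",
  "details": "user logged in"
}
```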

The plan is to use this design (Filebeat, log lines, Logstash and custom indices) several times in the application.

My question is simple: what would be best practice for removing the message field? Should I skip the csv filter entirely and just split the messages myself, or should I use the csv filter and remove the message field as the last action in the filter section?

Best regards
fgjensen

Simply removing the message field is fine. I would prefer the csv filter method, as its syntax may be more self-explanatory to the reader than splitting the message manually.
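
As a sketch (reusing the placeholder column names from above), remove_field can go directly on the csv filter; being a common filter option it is only applied when the filter succeeds, so a line that fails to parse keeps its raw message:

```
filter {
  csv {
    separator => "	"
    columns => ["timestamp", "level", "component", "details"]
    # only removed when the csv filter succeeds; unparsable lines keep the raw message
    remove_field => ["message"]
  }
}
```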

Hello @manud,

Thanks for the reply - I will keep the csv filter and remove the message field. I agree with you: the csv filter syntax is more readable for the people who are going to maintain the code.

Best regards
Flemming
