Overview of current system
Logstash 2.x reads from Apache log files.
These log files come from a large number of web services that we run.
The intention of this process is to record the various requests made to the services and to capture the usage patterns across the service applications.
The main pattern uses grok to process the Apache log files; some of them arrive via Syslog:
grok {
  patterns_dir   => "${EDW_root}/patterns"
  break_on_match => true
  match => { "message" => "%{ENDOFFILE:eof_marker}" }
  match => { "message" => "%{SYSLOGTIMESTAMP} %{SYSLOGHOST} %{PROG}: %{IPORHOST}:%{POSINT} %{COMBINEDAPACHELOG}" }
  match => { "message" => "%{COMBINEDAPACHELOG}" }
}
Most of the work is done processing the HTTP requests, e.g.:
if "/news/" in [HTTP][request] {
  mutate {
    add_field => { "[service][action][genus]"   => "STATIC PAGE" }
    add_field => { "[service][action][species]" => "NEWS" }
  }
}
Some patterns are simple, others are complex combinations of grok and match.
In total there are hundreds of patterns to look for across all the services.
The current system runs as a set of batch processes that work through the previous day's logs.
The output from Logstash is sent to Elasticsearch indices.
The indices are deleted after an upstream ETL process has extracted the previous day's data.
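For context, the output stage is along these lines; this is a minimal sketch, and the host and the daily index name are placeholders, not our actual values:

output {
  elasticsearch {
    hosts => ["localhost:9200"]      # placeholder host
    index => "edw-%{+YYYY.MM.dd}"    # hypothetical daily index, deleted after the ETL extract
  }
}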
We want to migrate the current system in the following way:
- Replace Logstash's reading of the Apache web log files with Filebeat.
- Replace the batch processing with continual processing of Filebeat events by Logstash running in Docker, one container per service.
- The Elasticsearch output cannot change, as a large upstream ETL system extracts data from it on a daily basis.
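The Filebeat side of this migration would look something like the following sketch, assuming the apache module (named apache2 in older Filebeat releases) and a Logstash output; the host names and port are placeholders:

# filebeat.yml -- minimal sketch, placeholders only
filebeat.modules:
  - module: apache
    access:
      enabled: true

output.logstash:
  hosts: ["logstash:5044"]

with a matching beats input on the Logstash side:

input {
  beats {
    port => 5044
  }
}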
Questions that I have:
- If I use the Filebeat Apache module, am I right in assuming that I will not need to do the initial filtering with the grok patterns? (The Syslog element is being removed.)
- What will be the contents of the event that is sent to Logstash? Is it similar to the structure produced by the COMBINEDAPACHELOG grok pattern?