Filebeat and Multiline

Hi,

We would like to use Filebeat as a shipper for log files.
Some of the log files contain multiline events.

Several servers are sending the same types of logs.

What is the best way of doing this?
Can I use conditions on the source server name in the Logstash filter section?

The setup will be:
Filebeat -> Logstash -> Elasticsearch.

Or can I start several Beats inputs in Logstash on different ports, with each server sending to a different port, and then still filter according to the source server name?

Thanks.


The next release, 1.1, will have multiline support implemented directly in Filebeat.

I don't understand. What's the exact problem you'd like to solve? What's the reason to filter on source server name?

Multiline itself is best handled close to the source. That's why it was added to Filebeat for the next 1.1 release.

In Logstash you can filter on basically any event field you want. There are several options for adding metadata to your events.

Check out the Filebeat exported fields documentation for an overview of the standard fields available in Logstash. For example, in Logstash you can filter on [beat][name] (configured by name in the shipper section) or [beat][hostname]. Or use tags or custom fields for filtering.
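
A minimal sketch of such a filter, assuming the host name is exposed as [beat][hostname] (check the exported fields documentation for the exact name in your version). The host name "weblogic01", the tag "weblogic" and the added tag and field are hypothetical placeholders, not something taken from your setup:

```
filter {
  # Assumption: "weblogic01" stands in for one of your source servers.
  if [beat][hostname] == "weblogic01" {
    mutate { add_tag => ["from_weblogic01"] }
  }

  # Assumption: the tag "weblogic" was set on the Filebeat side.
  if "weblogic" in [tags] {
    mutate { add_field => { "app" => "weblogic" } }
  }
}
```

The same kind of condition can wrap any other filter, so per-server or per-type handling does not need a separate input.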


Hi Stefens,

Thanks for the reply.

I have 5 WebLogic servers.
I am sending application logs from those servers using Filebeat.
Those logs are multiline.
So I need Logstash to be able to tell which data came from which source server.

In the filter section I have a multiline filter.
I thought of having the filter section five times, with each section checking for a different source server using [beat][hostname].

Will that give me a good separation of the data, so that there is no collision between the different servers?

I guess the question is: how do I implement the multiline filter with the current Filebeat release when several sources are sending the same type of logs?

I have started to implement it, but I need to know whether it is a good solution. I have found some messages that are incomplete, but I have not encountered any collisions so far.
What could be the reason for an incomplete event message?

Thanks,

Ori

I don't know much about multiline support in Logstash.

Which logstash-input-beats plugin version do you have installed? I think version 2.0.1 added multiline support and computes a "stream id" for use with multiline, e.g. see this pull request. You might be better off using the multiline codec instead of the filter.

The multiline filter also has a timeout to flush its buffers after a few seconds. See its documentation. If lines are not passed to the filter in time, or limits are reached, your multiline events will not be complete.

If you want to give Filebeat 1.1 a try, you can try the nightlies: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/

OK.

What if I have different log files from those different servers?

Can I set a different codec for each type, like the following:

input {
  beats {
    port => 5044

    if [type] == "type_1" {
      codec => multiline {
        pattern => "regexp1"
        negate => "true"
        what => "previous"
      }
    }

    if [type] == "type_2" {
      codec => multiline {
        pattern => "regexp2"
        negate => "true"
        what => "previous"
      }
    }

  } # Beat
} # Input

Thanks.

You can configure multiline for each prospector. This means that if your log files need different multiline settings, you must put them under different prospectors in the config file.

Hi Ruflin,

I was asking about the Logstash input plugins, not the Filebeat prospectors.
I will not be able to use the newer Filebeat release any time soon.

I need to implement the multiline filter with Filebeat 1.0.1 and Logstash 2.1.

What about the stream_identity option of the multiline filter?
Would it help to set it to: %{@source_host}.%{@type}

Thanks.

You can configure your own interpretation of the stream id in the multiline filter by setting the stream_identity option. It might make sense to define stream_identity yourself if you see loads of reconnects; otherwise (I'm not fully sure about this part) a stream id is generated uniquely per connection.

The stream identity is generated by the beats input plugin (see this piece of code).

As long as you want to apply the same multiline logic to all connections, it should work.

Unfortunately, you cannot use conditions in the input section. If you want to use different multiline patterns based on the input type, for example, you'd have to use the multiline filter. The filter takes the stream identity into account. There is no need to create a multiline filter per connection.
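
A rough sketch of that approach, with the conditions moved from the input section into the filter section and one multiline filter per type rather than per connection. The type names and patterns are the placeholders from your example; I have not run this against your logs:

```
input {
  beats {
    port => 5044   # a single input, no conditions here
  }
}

filter {
  if [type] == "type_1" {
    multiline {
      pattern => "regexp1"   # placeholder: matches the first line of an event
      negate  => true
      what    => "previous"
    }
  } else if [type] == "type_2" {
    multiline {
      pattern => "regexp2"   # placeholder
      negate  => true
      what    => "previous"
    }
  }
}
```

Single-line types simply do not match either condition and pass through untouched.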

Keep in mind that the multiline filter is sensitive to time. If for some reason the missing lines are not sent in time, the multiline event will be short. E.g. the max_age option sets the timeout per multiline event.
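
To illustrate the two options, here is a minimal sketch. The pattern is a placeholder, the max_age value of 10 is only an example, and the stream_identity string is an assumption on my side: it presumes the events carry [beat][hostname], source (the file path) and type, which you should verify against your own events:

```
filter {
  multiline {
    pattern => "regexp1"   # placeholder
    negate  => true
    what    => "previous"

    # Assumption: keeps lines from different hosts and files in separate
    # buffers, so a longer timeout is less likely to merge unrelated lines.
    stream_identity => "%{[beat][hostname]}.%{source}.%{type}"

    # Seconds to wait for further lines before a pending event is flushed.
    # The value is illustrative only.
    max_age => 10
  }
}
```

If raising max_age merges lines from different events, that would hint that the stream identity is not separating the sources well enough.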

I will move the topic to the Logstash forum.

Hi Stefens,

Thank you very much.

I have multiple source servers, with multiple types from each source server.
Some of the types are multiline, while others are single-line.
I have Filebeat installed on each source server, and the events are sent with the type included.
I have one Logstash instance, which accepts the incoming flow through one beats input.

In the filter section, I distinguish by source server name and type.

Some events really are being cut short, which may be related to max_age.
When I increase max_age, I notice lines from different events being merged.
So I need to keep it at its default value for now, but I do not want events to be missing lines.

I have been separating the source servers' inputs by port number, so that the streams do not all reach the beats input through the same port.
Logstash starts several beats inputs, each with its own port and its own source server.
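
For reference, a setup of this kind might look roughly like the sketch below. The port numbers, the source_server field and the tag are hypothetical names chosen for illustration, not the actual configuration:

```
input {
  beats {
    port      => 5044                              # hypothetical port for server1
    add_field => { "source_server" => "server1" }
  }
  beats {
    port      => 5045                              # hypothetical port for server2
    add_field => { "source_server" => "server2" }
  }
}

filter {
  # The field added by the input can then be combined with the type.
  if [source_server] == "server1" and [type] == "type1" {
    mutate { add_tag => ["server1_type1"] }
  }
}
```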

How can I prevent the events from being cut short?

Thanks.


Hi, I have the same question as you.
Do you know how to deal with it now?

Hi,

Currently I am still not using the Filebeat multiline capability, since I had some stability issues with it.
I am running it on Windows, and versions 1.1.2 and 1.2.2 were crashing all the time.
I am now on version 1.2.3. If it turns out to be OK, I will start using the Filebeat multiline capability.

I created a folder for the config files and put each log type for each server in a separate config file.
I added a config file for the inputs, a config file for the output, and two more: one to create new fields and one to drop unneeded fields.

It looks like this:
00_input
10_general - Add fields according to beats.host
50_type1_server1
50_type1_server2
.
.
.
50_typeN_server1
50_typeN_server2
80_general - Drop unneeded fields
99_output
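
As an illustration only, the two general files could look something like the sketch below. The host name, the added field and the list of dropped fields are assumptions made for the example, not the real contents of these files:

```
# 10_general: add fields according to the sending host
filter {
  # Assumption: the host name arrives as [beat][hostname].
  if [beat][hostname] == "server1" {
    mutate { add_field => { "hostname" => "server1" } }
  }
}

# 80_general: drop unneeded fields
filter {
  # Assumption: these Filebeat fields are not needed downstream.
  mutate { remove_field => ["offset", "input_type", "count"] }
}
```

When -f points at a directory, Logstash reads the files in lexicographical order and concatenates them, which is why the numeric prefixes decide the order of the filters.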

In the 50* files I have the following:

filter {
  if [hostname]=="server1" {
    if [type]=="type1" {
      .
      .
      .
      .
      .
    }
  }
}

Those files contain the multiline filters.
And that's how I separate multiple servers with multiline events.

I believe there is a performance degradation, so I am waiting to start using the multiline capability of Filebeat, after which the number of config files will drop dramatically. Separate files will then only be needed for the types.

Ori

OK, thanks for your reply, I will try it.