Logstash reads file with multiple lines as "1 hit"

Running Windows 10, with Logstash 8.1.0 and Elasticsearch, Kibana, and Filebeat 8.0.0 all on the same machine.

Getting the data from Logstash into Elasticsearch and Kibana works, but when multiple files get read at the same time, the files with multiple lines are seen as only one record.

Here is the pipeline, which is in C:\Program Files\logstash-8.1.0\first-pipeline.conf:

input {
    beats {
        port => "5044"
    }
}
filter {
    grok {
        patterns_dir => ["./patterns"]
        match => { "message" => "%{DATESTAMP:DateTime},%{PName:Program},%{POSINT:ProcessID},%{POSINT:UsageCPU}" }
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        user => "elastic"
        password => "my password"
        ssl => true
        ssl_certificate_verification => false

        # The name of the index
        index => "<logstash_filelog>"
    }
}
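A side note on the grok filter: %{PName:Program} refers to a custom pattern named PName, which has to be defined in a file under the ./patterns directory. As a sketch (the file name and regex here are assumptions, not taken from the actual setup), such a pattern file could look like:

```
# ./patterns/custom-patterns
# PName: a program name made of letters (an assumption; adjust to your data)
PName [A-Za-z]+
```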

If new entries get into the file while the pipeline and Filebeat are active, they show up as new hits in the Discover tab. But when they all get read at once, for example the next day, they are marked as only one record even though there are multiple lines.

Here you can see the issue: multiple lines are seen as one.

Here is the structure of the log file:

DateTime, Program, PID, CPU Usage
08.03.2022 14:53:41,Chrome,1715,58
08.03.2022 09:43:01,Edge,10268,74

Only the first row of the message gets parsed by grok; the rest is ignored.

Is there something I'm missing that could help me with this issue?
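For reference, each individual log line should match the grok expression. The pattern can be approximated with a plain regular expression to sanity-check a sample line (a sketch in Python; DATESTAMP is simplified to the dd.mm.yyyy hh:mm:ss form shown in the sample, and PName is assumed to be letters only):

```python
import re

# Rough stand-ins for the grok patterns used in the pipeline:
#   %{DATESTAMP} -> dd.mm.yyyy hh:mm:ss
#   %{PName}     -> letters (assumed custom pattern)
#   %{POSINT}    -> digits
LINE = re.compile(
    r"(?P<DateTime>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}),"
    r"(?P<Program>[A-Za-z]+),"
    r"(?P<ProcessID>\d+),"
    r"(?P<UsageCPU>\d+)$"
)

m = LINE.match("08.03.2022 14:53:41,Chrome,1715,58")
print(m.groupdict())
```

Each line matches on its own, which is consistent with what happens when several lines arrive fused into one event: grok extracts one set of fields per event, so only the first row of the combined message is captured.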

Since your input is beats, you need to share your Filebeat configuration; the message is already arriving this way in Logstash.

Hey, thanks for the quick answer. Here is my filebeat.yml:

filebeat.inputs:

# filestream is an input for collecting log messages from files.
- type: filestream

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths. 
  paths:
    - C:\TEMP\filebeatlog\*.log

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list. example: ['^ERR', '^WARN']
  include_lines: [A-Za-z0-9]

output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]
  username: "elastic"
  password: "my password" 

The rest stayed default; I did not change anything. I saw some multiline settings in the reference.yml. Maybe I should include them like this?

  # The maximum number of lines that are combined to one event.
  # In case there are more than max_lines, the additional lines are discarded.
  # Default is 500
  multiline.max_lines: 1

  # The number of lines to aggregate into a single event.
  multiline.count_lines: 1

It seems like the default is 500. Maybe that's the reason why all these lines get seen as one?

No, the multiline.* settings only have an effect if you are using multiline, which you aren't, judging from the filebeat.yml you shared.

Since you do not have multiline configured and Filebeat sent all the lines of the file as a single event, it is probably the source file that is this way.

What is creating this source file?
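For context, an explicit multiline setup for a filestream input would look roughly like this in filebeat.yml (a sketch of the parsers syntax; the pattern shown is an assumption based on the timestamp format above). Nothing like this appears in the shared config, which is why the multiline.* defaults are irrelevant here:

```yaml
- type: filestream
  paths:
    - C:\TEMP\filebeatlog\*.log
  parsers:
    - multiline:
        type: pattern
        # Assumed: every new event starts with a dd.mm.yyyy timestamp
        pattern: '^\d{2}\.\d{2}\.\d{4}'
        negate: true
        match: after
```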

Hey, thanks for your answer.
It's a simple Delphi application that logs random numbers and names out of a range, one line every 20 seconds, nothing more.
You can see the structure of the log file at the top.

I'm still having this issue and searching for a solution.

If anybody has any idea or a solution to this issue, please answer in this thread.

I was checking the event of a logfile and saw this:

log.flags is set to multiline. Is this something I am able to control or change? Shouldn't the input.type be filestream and not "log", as seen in the screenshot?

Thanks.

Yes, if you were running a filestream collector then the type would be filestream, not log. Perhaps someone else is running a second filebeat instance?

Hey thanks for your answer!

The only place I am able to set filestream is in the filebeat.yml, which I shared. I don't think this is an issue.

The bigger issue is, as described, the multiple lines seen as one entry. Do you maybe have any idea what's causing this?

I suspect it is a multiline configuration in the filebeat run by someone else.

I run the whole stack locally so there is only one filebeat running.

I'm still looking for a solution. Is this maybe a bug, and would changing the Filebeat version help?

Can you share the log file that you're trying to read with Filebeat? (File Upload, not a copy&paste here).

You didn't set any multiline configuration (good), so the only thing that can cause Filebeat to send a document with multiple lines in "message" is that Filebeat is not recognizing the line terminator in your file.

Alternatively, just try adding the line_terminator option to the filestream config:

line_terminator: carriage_return
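The suggestion above can be illustrated outside Filebeat: if a file ends its lines with a bare carriage return (\r) and the reader only splits on line feed (\n), the whole file looks like a single "line". This is a plain Python sketch of that mismatch, not Filebeat's actual implementation:

```python
# Two records terminated by bare CR (\r), with no LF (\n) anywhere.
data = b"08.03.2022 14:53:41,Chrome,1715,58\r08.03.2022 09:43:01,Edge,10268,74\r"

lf_lines = data.split(b"\n")                     # LF-only splitting
cr_lines = [l for l in data.split(b"\r") if l]   # CR-aware splitting

print(len(lf_lines))  # 1 -> everything fused into one "line"
print(len(cr_lines))  # 2 -> the two records separate cleanly
```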

Hey, thanks for your answer.
I'm not allowed to upload the log here ("Sorry, the file you are trying to upload is not authorized (authorized extensions: jpg, jpeg, png, gif).")
I have uploaded it to Google Drive here:

Unfortunately, when using

line_terminator: carriage_return

I get the same problem: the event is still flagged as multiline.

Hey,

did you find anything that could cause this issue?

This part of your config is wrong. You need to surround the regular expression with quotes, like this:

include_lines: '[A-Za-z0-9]'

I guess you have this fixed in your setup, otherwise you wouldn't be ingesting any data due to this error. If that's not the case, then definitely this must be another Filebeat instance running with multiline settings.
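The quoting matters because of how YAML parses the value: unquoted, [A-Za-z0-9] is a flow-style list containing the single string "A-Za-z0-9", which as a regular expression only matches that literal text; quoted, '[A-Za-z0-9]' stays a character class matching any alphanumeric. A quick Python sketch of the difference (this only approximates Filebeat's include_lines matching):

```python
import re

line = "08.03.2022 14:53:41,Chrome,1715,58"

# Unquoted YAML: [A-Za-z0-9] yields the list ["A-Za-z0-9"],
# i.e. a regex for the literal text "A-Za-z0-9".
print(bool(re.search("A-Za-z0-9", line)))   # False -> line filtered out

# Quoted YAML: '[A-Za-z0-9]' is kept as a character class.
print(bool(re.search("[A-Za-z0-9]", line))) # True -> line kept
```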

Please post your current configuration; first remove the line_terminator test, as it is not necessary with your data.

@kvch, can you help here? Can this be caused by file rotation?


Hey, thanks for your answer.

This:

This part of your config is wrong. You need to surround the regular expression with quotes, like this:
include_lines: '[A-Za-z0-9]'

seems to have been my issue. Now the logs come in as single lines: I just imported a log with ~1400 lines, and every line became its own event.

Thank you so much for helping me with this issue.

Well, that's weird, because when I tested with that error, I was getting no data ingested (all events were filtered) instead of the full file being ingested as a single document.

Yeah, it is quite weird. If I remember correctly, it was without the quotes in the Filebeat reference documentation, so I didn't think this was the issue, but apparently it was.

Thanks again.