Logstash not reading the log data in correct sequence

Hi,

I have an issue with data parsing through Logstash. It reads a CSV file but assigns the data incorrectly in the message field.

Below is the overall description:

I am parsing a CSV file in Logstash. The file records the used percentage of each file system on the server.
The CSV content looks like this:

hostname,date,time,filesystem,usedpercent
abc,07:02:18,18:30:00,/app/logstash,40

The Logstash config is:

input {
  file {
    path => "filepath.csv"
    type => "diskspace"
    id => "filesystemusedid"
    # Read position is not persisted across restarts
    sincedb_path => "/dev/null"
    ignore_older => 43200
    close_older => 300
  }
}

filter {
  if [type] == "diskspace" {
    csv {
      columns => ["FSHostname", "FileSysDate", "FileSysTime", "FSFileSystem", "FSUsed"]
      add_field => {
        "filesystemtimevalue" => "%{FileSysDate} %{FileSysTime}"
        "Type" => "filespace"
      }
      convert => { "FSUsed" => "integer" }
    }
    # Parse the combined date and time into a separate timestamp field
    date {
      target => "fsusedpercenttmpstmp"
      match => ["filesystemtimevalue", "dd:MM:yy HH:mm:ss"]
    }
    mutate {
      remove_field => ["FileSysDate", "FileSysTime"]
    }
    # Hash of the raw line, used as the document id to avoid duplicates
    fingerprint {
      source => ["message"]
      target => "fingerprint"
      key => "37373737"
      method => "SHA1"
      concatenate_sources => true
    }
  }
}

output {
  if [Type] == "filespace" {
    elasticsearch {
      hosts => "hostname"
      index => "filesystemused-%{+xxxx.ww}"
      document_id => "%{fingerprint}"
    }
    stdout { codec => rubydebug }
  }
}

The problem is that Logstash parses the file correctly for some records, but for others the message field contains the wrong value, as shown below, which leads to a _dateparsefailure tag.

[2018-02-07T13:00:03,371][DEBUG][logstash.pipeline ] output received {"event"=>{"FSHostname"=>"18", "filesystemtimevalue"=>"10:00:00 /app/logstash", "message"=>"18,10:00:00,/app/logstash,12", "type"=>"diskspace", "FSFileSystem"=>"12", "tags"=>["_dateparsefailure"], "path"=>"/ELK/finalFSdata/FileSystem_180207.csv", "Type"=>"filespace", "@timestamp"=>2018-02-07T13:00:03.230Z, "@version"=>"1", "host"=>"hostname", "fingerprint"=>"4fe229656a0352e72a29e70a8d9878bbaf6e46109"}}
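
To see why this event ends up with a _dateparsefailure tag, here is a minimal Python sketch (not Logstash itself) that mimics how a list of column names is applied positionally to the fields of a line. The helper name map_columns is made up for illustration; the column names and the message come from the config and the debug output above.

```python
import csv

# Column names as configured in the csv filter above
COLUMNS = ["FSHostname", "FileSysDate", "FileSysTime", "FSFileSystem", "FSUsed"]

def map_columns(line):
    """Assign column names to fields by position (hypothetical helper)."""
    fields = next(csv.reader([line]))
    # zip stops at the shorter sequence, so a 4-field line gets no FSUsed
    return dict(zip(COLUMNS, fields))

# The message from the debug output has only 4 fields
event = map_columns("18,10:00:00,/app/logstash,12")
time_value = "{} {}".format(event["FileSysDate"], event["FileSysTime"])

print(event)
print(time_value)
```

With only four fields, every value shifts one column to the left, so the combined date and time value becomes "10:00:00 /app/logstash". That string cannot match the pattern "dd:MM:yy HH:mm:ss", which is consistent with the tag in the output above.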

I tried dropping the index and re-creating it multiple times. I also re-created the CSV files, but that did not solve the issue.

Please help me identify the problem.

"message"=>"18,10:00:00,/app/logstash,12",

This input line is corrupt. Is this line present in the input file or is it possible that the file was updated while Logstash was reading it, introducing this corruption?
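
One way to check whether such a line is really present in the file is to scan it for rows that do not have the expected five fields. This is only a sketch: the function name find_malformed_lines and the sample rows are made up for illustration, and the real file path would have to be substituted.

```python
import csv

EXPECTED_FIELDS = 5  # hostname, date, time, filesystem, usedpercent

def find_malformed_lines(lines, expected=EXPECTED_FIELDS):
    """Return (line_number, row) for every non-empty row with the wrong field count."""
    bad = []
    for number, row in enumerate(csv.reader(lines), start=1):
        if row and len(row) != expected:
            bad.append((number, row))
    return bad

# Illustrative sample; for a real check, pass an open file instead, e.g.
#   with open(path_to_csv, newline="") as fh:
#       print(find_malformed_lines(fh))
sample = [
    "hostname,date,time,filesystem,usedpercent",
    "abc,07:02:18,18:30:00,/app/logstash,40",
    "18,10:00:00,/app/logstash,12",
]
malformed = find_malformed_lines(sample)
print(malformed)
```

If the scan of the real file returns nothing, the file itself is well formed and the fragment was probably produced at read time rather than written to disk.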

The input line is abc,07:02:18,18:30:00,/app/logstash,12 but Logstash interprets it as "18,10:00:00,/app/logstash,12".

The file is appended to at minute 0 and minute 45 of each hour.

The file has the proper format, but some of the data is misinterpreted by Logstash.

Is it because of the forward slash in the input? But the problem does not occur for all values.

How can I find out whether Logstash is reading the file while it is being updated? At present Logstash reads 20 files, and only the input from this file is parsed incorrectly. Different cron jobs update the files.
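
A rough diagnostic, sketched here under assumptions, is to test whether a bad message is the tail end of a complete line in the file. If it is, the file holds the full record and only a fragment of it reached the pipeline, which would point towards a read that started or resumed in the middle of a line. The function name find_truncated_source and the sample line are hypothetical; a match is a hint, not proof of the cause.

```python
def find_truncated_source(message, lines):
    """Return (line_number, line) for lines that end with message but are longer than it."""
    matches = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if text != message and text.endswith(message):
            matches.append((number, text))
    return matches

# Hypothetical file content, made up to illustrate a match
sample_lines = [
    "hostname,date,time,filesystem,usedpercent\n",
    "abc,07:02:18,10:00:00,/app/logstash,12\n",
]
observed = "18,10:00:00,/app/logstash,12"
hits = find_truncated_source(observed, sample_lines)
print(hits)
```

Running this against the real file with each bad message from the debug log would show whether the fragments line up with complete records.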
