Read old logs in .gz format


(Simon Risberg) #1

Hi!

I was wondering if it's possible for Logstash to read old logs in .gz format that are a day old?

Best regards


(Magnus Bäck) #2

This isn't supported out of the box (see https://github.com/elastic/logstash/issues/1817). You'll have to uncompress them yourself and delete them afterwards.
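A minimal sketch of that uncompress-then-delete workflow (the path here is just an example matching the layout discussed in this thread; adjust it to your own):

```shell
# gunzip -c decompresses to stdout and leaves the original .gz untouched;
# writing to a new name lets Logstash's file input pick it up as a separate file.
gunzip -c /var/externallogs_maven/request.log.2015-06-22.gz \
  > /var/externallogs_maven/request.log.2015-06-22

# Once Logstash has processed it, remove the uncompressed copy:
rm /var/externallogs_maven/request.log.2015-06-22
```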


(Simon Risberg) #3

So I extracted the old log file manually and put it in the same directory as the current log file, but of course with a different name. I'm trying to get Logstash to read it, but it doesn't seem to work. In hindsight, my original question was a bit odd. My question now is this: can Logstash read old log files if I have extracted them manually?

This is what my input section looks like right now. Note that the last file block is the old log file I'm trying to read in. Is this correct?

file {
  path => "/var/externallogs_maven/request.log"
  type => "nexus-log"
}
file {
  path => "/var/externallogs_maven/nexus.log"
  type => "nexus-log"
}
file {
  path => "/var/externallogs_yum/request.log"
  type => "nexus-log"
}
file {
  path => "/var/externallogs_yum/nexus.log"
  type => "nexus-log"
}
file {
  path => "/var/externallogs_maven/request.log.2015-06-22"
  type => "juni22-log"
}

EDIT: Nope, I didn't mess up the time filter. The question still remains.


(Magnus Bäck) #4

Looks good, but make sure the Logstash user has permissions to read the files. IIRC you have to start Logstash with --verbose for it to complain about permission issues when opening files.
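A quick way to rule out the permission issue is to try reading the file as the Logstash user. This is a sketch that assumes Logstash runs as a user named "logstash" (the path is the one from the config above):

```shell
# Show the owner and mode of the file Logstash should read:
ls -l /var/externallogs_maven/request.log.2015-06-22

# Try reading it as the logstash user; prints "readable" only on success.
sudo -u logstash test -r /var/externallogs_maven/request.log.2015-06-22 \
  && echo "readable" || echo "not readable"
```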


(Simon Risberg) #5

Okay, this is very strange. I've been messing around with it for a few hours now. Basically, I have a folder with two fresh logs called "nexus.log" and "request.log" which update regularly; I have no problem with these. In the same folder I have a few .gz files called "request.log.2015-06-22", "request.log.2015-06-23", etc. If I unzip "request.log.2015-06-18", Logstash reads all the logs from that particular date until today's date. However, it doesn't read the log messages from before 08.00, which is very strange to me. I grepped the different log files to verify that the data was the same, and it was. Is there some problem with the time format?


(Magnus Bäck) #6

Wait, my bad. I forgot to mention that you need start_position => "beginning" for Logstash to consider reading files from the beginning instead of just tailing them. The tricky part (which many people miss in the fine print) is that this only applies to unseen files, i.e. files for which Logstash doesn't have a sincedb entry, and your files aren't unseen at this point. Either recreate each file by copying it or delete the corresponding sincedb file. You'll find those files in ~logstash; the first column of each line is the inode number of the tracked file, which you can find in the output of ls -li.


(Simon Risberg) #7

Thank you. I'll come back to you if it doesn't work, but it seems like it will solve my problem. Strangely, though, it never read the file from the beginning in the first place.


(Simon Risberg) #8

The thing is, I'm running Logstash inside a Docker container. Doesn't that mean I don't actually need to delete anything, because the container doesn't remember indexing all those log files when I switch it off and on again?

