Parse Old .gz Log File

Hi everybody,
hope that someone can help me solve my issue. I've already read all the related topics on this forum but didn't find anything that applies to my case.
I use ELK to parse and process the access.log files of my Squid proxy/cache server, and everything works fine for the current access.log file. The problem is with the old log files: when I uncompress them and rename them access1.log, access2.log, etc. so that they get picked up as well, I don't get any results.

I also set up Filebeat to send its output to Logstash on "localhost:5044"; under "Filebeat prospectors" the path to be fetched is /var/log/squid/*.log. Then in /usr/share/logstash I have a pipeline file where Logstash takes the input from Filebeat, parses it with grok, and sends the output to Elasticsearch.
Here is the beat-pipeline config: https://pastebin.com/mF6UPmGc
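
For reference, the shape of my setup is roughly the following (the paths and the grok pattern here are only illustrative, not my exact config):

input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    # Illustrative pattern for Squid's native access.log format
    match => { "message" => "%{NUMBER:timestamp}\s+%{NUMBER:duration} %{IPORHOST:client} %{WORD:cache_result}/%{NUMBER:status} %{NUMBER:bytes} %{WORD:method} %{NOTSPACE:url} %{NOTSPACE:user} %{WORD:hierarchy}/%{NOTSPACE:peer} %{NOTSPACE:content_type}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}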

Any idea?
Thanks in advance to anyone.
Davide

I'm running Ubuntu Server 16.04, and Elasticsearch, Kibana and Logstash are all at 6.2.4.

We have just released a new version of the file input that supports direct gzip file processing.
From the Squid log-rotation docs I assume that you have the individual rotated files gzipped.

So I assume you want to have a central Logstash instance with beats on the far end squid server.
squid(filebeat++) -> LS -> ES

If you want to give the new code a try, you should still use Filebeat for tailing the "live" Squid logs, but you can move or copy the compressed files to the LS machine. Update the file input plugin with `bin/logstash-plugin update logstash-input-file`. Link to CHANGELOG.

Add the file input to your config and add real paths.

input {
  beats {
      port => "5044"
  }
  file {
    path => "/path/to/gz/files/*.gz"
    sincedb_path => "/path/to/gz/position-tracking.sincedb"
    mode => "read"
    file_completed_action => "log"
    file_completed_log_path => "/path/to/gz/completed.log"
  }
}
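
If you mix live beats events and archived file events in one pipeline, you may want to tell them apart downstream. One way is the common `tags` option, available on any input (a sketch, reusing the same hypothetical paths):

file {
  path => "/path/to/gz/files/*.gz"
  sincedb_path => "/path/to/gz/position-tracking.sincedb"
  mode => "read"
  file_completed_action => "log"
  file_completed_log_path => "/path/to/gz/completed.log"
  # mark these events so filters/outputs can treat them differently
  tags => ["archived"]
}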

You may want to add the debugging stdout to your config so you can see what documents are generated.

output {
  stdout {
    codec => rubydebug
  }
}

If you want to reread the documents, delete the sincedb file after each run (in my example it is "/path/to/gz/position-tracking.sincedb"), or temporarily use /dev/null.
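
The /dev/null variant would look like this (again with the hypothetical paths from above):

file {
  path => "/path/to/gz/files/*.gz"
  mode => "read"
  # /dev/null means no read positions are persisted, so every
  # pipeline restart rereads all matching files from the start
  sincedb_path => "/dev/null"
  file_completed_action => "log"
  file_completed_log_path => "/path/to/gz/completed.log"
}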

Let us know if this new feature works for you.

Hi @guyboertje,
thank you very much for your reply. I'll test it this weekend and let you know.
Thank you!


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.