Parse Old .gz Log File


(davide) #1

Hi everybody,
I hope someone can help me solve my issue. I've already read all the related topics on this forum but didn't find anything that applies to my case.
I use ELK to parse and process the access.log files of my Squid proxy/cache server, and everything works fine for the current access.log file. The problem is that when I uncompress the old log files and rename them access1.log, access2.log, etc. so that they get picked up as well, I don't get any results.

I also set up Filebeat to send its output to Logstash on "localhost:5044"; under the Filebeat prospector section the path to be fetched is /var/log/squid/*.log. Then in /usr/share/logstash I have a pipeline file where Logstash takes the input from Filebeat, parses it with grok, and sends the output to Elasticsearch.
Here is the beats-pipeline config: https://pastebin.com/mF6UPmGc
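
For reference, a minimal sketch of what such a beats-to-Elasticsearch pipeline typically looks like. The grok pattern, field names, and the "localhost:9200" Elasticsearch address are assumptions for illustration (the pattern assumes Squid's default native access.log format); the actual config is in the pastebin link above:

```conf
input {
  beats {
    port => "5044"
  }
}
filter {
  grok {
    # Assumed Squid native log layout:
    # timestamp duration client result/status bytes method url user hierarchy/peer type
    match => { "message" => "%{NUMBER:timestamp}\s+%{NUMBER:duration} %{IP:client_address} %{WORD:cache_result}/%{NUMBER:status_code} %{NUMBER:bytes} %{WORD:request_method} %{NOTSPACE:url} %{NOTSPACE:user} %{WORD:hierarchy_code}/%{NOTSPACE:peer_host} %{NOTSPACE:content_type}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```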

Any idea?
Thanks in advance to anyone.
Davide

I'm running Ubuntu Server 16.04, and Elasticsearch, Kibana and Logstash are all at version 6.2.4.


(Guy Boertje) #2

We have just released a new version of the file input that supports direct gzip file processing.
From the Squid rotate docs I assume that you have individual files zipped.

So I assume you want to have a central Logstash instance with beats on the far end squid server.
squid(filebeat++) -> LS -> ES

If you want to give the new code a try you should still use Filebeat for tailing the "live" Squid logs, but you can move or copy the compressed files to the LS machine. Update the file input plugin: `bin/logstash-plugin update logstash-input-file`. Link to CHANGELOG.

Add the file input to your config and add real paths.

input {
  beats {
    port => "5044"
  }
  file {
    path => "/path/to/gz/files/*.gz"
    sincedb_path => "/path/to/gz/position-tracking.sincedb"
    mode => "read"
    file_completed_action => "log"
    file_completed_log_path => "/path/to/gz/completed.log"
  }
}

You may want to add a debugging stdout output to your config so you can see what documents are generated.

output {
  stdout {
    codec => rubydebug
  }
}

If you want to re-read the files on a later run, delete the sincedb file after each run (in my example it is "/path/to/gz/position-tracking.sincedb"), or temporarily use /dev/null as the sincedb_path.
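
If you prefer not to delete the file by hand, the same file input can point its sincedb at /dev/null so no position is ever persisted (the paths here are the placeholders from the example above):

```conf
  file {
    path => "/path/to/gz/files/*.gz"
    # /dev/null discards the recorded positions, so every run starts from scratch
    sincedb_path => "/dev/null"
    mode => "read"
    file_completed_action => "log"
    file_completed_log_path => "/path/to/gz/completed.log"
  }
```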

Let us know if this new feature works for you.


(davide) #3

Hi @guyboertje,
thank you very much for your reply. I'll test it this weekend and let you know.
Thank you!


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.