Thanks. In fact there were a few things to fix:
- In the input section of the pipeline, we must add `sincedb_path => "/dev/null"` to tell Logstash to re-process the data, as you said.
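For context, a minimal input section with this setting could look like the following (the file path is illustrative, not from the original pipeline):

```
input {
  file {
    path => "/path/to/logstash-tutorial.log"   # illustrative path
    start_position => "beginning"              # read the file from the start
    sincedb_path => "/dev/null"                # don't persist the read position, so the file is re-processed on every run
  }
}
```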
- The source data file must not be older than 24h, as you said, so (on Linux) a `touch logstash-tutorial.log` will fix the problem.
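To illustrate the 24h point, a quick shell check (assuming GNU coreutils on Linux; `stat -c %Y` prints the file's modification time as an epoch timestamp):

```shell
# Refresh the file's modification time so it is no longer "too old":
touch logstash-tutorial.log

# Verify the file is now less than 24 hours (86400 seconds) old:
age=$(( $(date +%s) - $(stat -c %Y logstash-tutorial.log) ))
[ "$age" -lt 86400 ] && echo "file is fresh"
```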
- The filter section should include a `date` filter to parse the timestamp extracted from the log; otherwise Logstash will set the event timestamp to the current time at which it reads the file (quite useless):

```
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    source => "clientip"
  }
}
```
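As a sanity check of what the `dd/MMM/yyyy:HH:mm:ss Z` pattern matches, here is a timestamp in the Apache combined-log style (the sample value is illustrative) parsed with GNU `date`:

```shell
# An Apache access-log timestamp as captured by the grok "timestamp" field:
ts="04/Jan/2015:05:13:42 +0000"

# Rewrite slashes to spaces and the first colon to a space, so GNU date can parse it:
date -u -d "$(echo "$ts" | sed 's#/# #g; s#:# #')" +"%Y-%m-%dT%H:%M:%SZ"
# → 2015-01-04T05:13:42Z
```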
- Another useful edit, in the output section, is to define the index name as something more meaningful:

```
output {
  elasticsearch {
    hosts => ["192.168.77.200:9200"]
    index => "apachelogs-%{+YYYY.MM.dd}"
  }
  stdout {}
}
```
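For intuition on the `%{+YYYY.MM.dd}` part: Logstash expands it from each event's `@timestamp` (as set by the `date` filter above), so logs land in a per-day index. As a rough shell analogy for an event timestamped today (note that Logstash uses the event time, not the wall clock):

```shell
# Approximate the resolved index name for an event dated "now":
date -u +"apachelogs-%Y.%m.%d"
```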