Skip reading historical data in Logstash while parsing logs

Hi,

I am reading my logs from an S3 bucket that contains data from the month of January. When I run my configuration, Logstash reads all the historical data, but I want to read only today's data. How can I do that? What should the configuration be?

Below is my configuration:

input {
  s3 {
    access_key_id => "*****************"
    secret_access_key => "*******************"
    region => "us-west-1"
    bucket => "abc-logs"
    codec => "plain"
    type => "access_logs"
  }
}
output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch {
      hosts => "localhost:9200"
      index => "abc"
    }
  }
  stdout { codec => rubydebug }
}

What should I add to avoid reading historical data?

If you parse the events so that the @timestamp field is populated correctly, you can use the age filter to compute the age of each event, and then it's easy to drop those that are too old.

Hmm,

I'm confused by the example given in the documentation:

filter {
  age {}
  if [@metadata][age] > 86400 {
    drop {}
  }
}

What is @metadata here? How can I use @timestamp, and what is 86400?

i think "age" filter is not supported in logstash 6.x

Please clarify my doubts.

What is @metadata here?

How can I use @timestamp, and what is 86400?

86400 is the number of seconds in a day. The age filter examines @timestamp and stores the age of the event in the [@metadata][age] field.
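
For example, something like this (just a sketch; it assumes @timestamp has already been set from the log's own time, e.g. with a date filter):

filter {
  # Stores the event's age in seconds (now minus @timestamp) in
  # [@metadata][age]; @metadata fields are never sent to outputs.
  age {}
  # 86400 = 24 * 60 * 60, i.e. one day in seconds.
  if [@metadata][age] > 86400 {
    drop {}
  }
}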

i think "age" filter is not supported in logstash 6.x

Really? You can't install it with logstash-plugin install logstash-filter-age?
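
It's a one-liner (the path assumes you run it from the Logstash home directory; package installs typically live under /usr/share/logstash):

bin/logstash-plugin install logstash-filter-age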

Yes, I installed the age filter and configured it in my Logstash configuration as below:

age {}
if [@timestamp][age] > 86400 {
  drop {}
}

After running the configuration I am getting the error below:

[2018-04-24T14:46:44,958][ERROR][logstash.pipeline ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. {:pipeline_id=>"main", "exception"=>"undefined method `>' for nil:NilClass", "backtrace"=>["(eval):274:in `block in initialize'", "org/jruby/RubyArray.java:1734:in `each'", "(eval):272:in `block in initialize'", "(eval):291:in `block in initialize'", "org/jruby/RubyArray.java:1734:in `each'", "(eval):286:in `block in initialize'", "(eval):226:in `block in filter_func'", "/home/avk03/JarAndZip/jar-file/config/logstash-6.1.1/logstash-core/lib/logstash/pipeline.rb:455:in `filter_batch'", "/home/avk03/JarAndZip/jar-file/config/logstash-6.1.1/logstash-core/lib/logstash/pipeline.rb:434:in `worker_loop'", "/home/avk03/JarAndZip/jar-file/config/logstash-6.1.1/logstash-core/lib/logstash/pipeline.rb:393:in `block in start_workers'"], :thread=>"#<Thread:0x17462633 sleep>"}
[2018-04-24T14:46:45,045][FATAL][logstash.runner ] An unexpected error occurred! {:error=>#<NoMethodError: undefined method `>' for nil:NilClass>, :backtrace=>["(eval):274:in `block in initialize'", "org/jruby/RubyArray.java:1734:in `each'", "(eval):272:in `block in initialize'", "(eval):291:in `block in initialize'", "org/jruby/RubyArray.java:1734:in `each'", "(eval):286:in `block in initialize'", "(eval):226:in `block in filter_func'", "/home/avk03/JarAndZip/jar-file/config/logstash-6.1.1/logstash-core/lib/logstash/pipeline.rb:455:in `filter_batch'", "/home/avk03/JarAndZip/jar-file/config/logstash-6.1.1/logstash-core/lib/logstash/pipeline.rb:434:in `worker_loop'", "/home/avk03/JarAndZip/jar-file/config/logstash-6.1.1/logstash-core/lib/logstash/pipeline.rb:393:in `block in start_workers'"]}

Please help me :((

[@metadata][age], not [@timestamp][age].

Hi, thanks for the reply.
I changed my configuration to:

age {}
if [@metadata][age] > 86400 {
  drop {}
}

But it is still reading data from 2018-01-22, which it is supposed to drop.

What's wrong here?

Please show an example of such a document. You can copy/paste from Kibana's JSON tab or use a stdout { codec => rubydebug } output.

Hi,
Below is a sample document. I am trying to read ELB access log data:

{
    "method" => "GET",
    "token" => "HFboNZI168vuLmSL1521536284629",
    "name" => "elb_access_log",
    "@timestamp" => 2018-01-22T10:50:10.059Z,
    "timestamp" => "2018-01-22T10:50:10.059992Z",
    "requestProcessingTime" => 4.0e-05,
    "backendProcessingTime" => 0.001899,
    "responseProcessingTime" => 2.6e-05,
    "httpversion" => "1.1",
    "@version" => "1",
    "request" => "http://xxxxx.com:80/",
    "serverIp" => "x.x.x.x",
    "clientIp" => "x.x.x.x",
    "backend_status_code" => 200,
    "sentBytes" => 4438,
    "clientPort" => "30818",
    "serverPort" => "80",
    "abc" => "abc",
    "agent" => "\"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36\"",
    "elb_status_code" => 200,
    "type" => "elb_access",
    "recivedBytes" => 0,
    "geoip" => {
        "region_name" => "Maharashtra",
        "continent_code" => "AS",
        "country_code2" => "IN",
        "region_code" => "MH",
        "longitude" => 72.8258,
        "city_name" => "Mumbai",
        "timezone" => "Asia/Kolkata",
        "location" => {
            "lat" => 18.975,
            "lon" => 72.8258
        },
        "latitude" => 18.975,
        "country_name" => "India",
        "country_code3" => "IN",
        "ip" => "13.126.167.102"
    },
    "useragent" => {
        "os" => "Linux",
        "major" => "36",
        "build" => "",
        "name" => "Chrome",
        "patch" => "1985",
        "os_name" => "Linux",
        "device" => "Other",
        "minor" => "0"
    },
    "message" => "2018-01-22T10:50:10.059992Z elb_access_log x.x.x.x:30818 x.x.x.x:80 0.00004 0.001899 0.000026 200 200 0 4438 \"GET http://xxxxxx.com:80/ HTTP/1.1\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36\" - -\n"
}

@timestamp is by default set to the time the event is processed, so unless you are using a date filter to set it based on data in your logs prior to the age filter, the age is always going to be less than 86400 seconds.

@Christian_Dahlqvist

I am parsing the log's time using the date filter plugin as below:

date {
  match => [ "timestamp", "ISO8601" ]
}

And you are doing that before you use the age filter?

Yes; first I am using the age filter, then the date filter.

You have to do it the other way around. At the point where you are currently applying the age filter, the @timestamp field is populated with the default value, which is the current time.
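
In outline, the order should be something like this (a sketch based on your snippets; the grok that extracts the timestamp field also has to run before the date filter):

grok {
  # ... extracts the "timestamp" field from the message ...
}
date {
  # Sets @timestamp from the log's own time.
  match => [ "timestamp", "ISO8601" ]
}
# Only now does @timestamp reflect the event time, so the age is meaningful.
age {}
if [@metadata][age] > 86400 {
  drop {}
}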

You mean I first have to parse the log time with the date filter, and then use the age filter?

Yes.

OK, thanks for the reply. I will try that and get back to you; please wait.

@Christian_Dahlqvist
Now it leads to a different issue; I am getting a _grokparsefailure:

{
    "@timestamp" => 2018-04-24T11:20:39.360Z,
    "type" => "elb_access",
    "message" => "2018-01-22T14:39:14.878876Z elb_access_log x.x.x.x:35004 - -1 -1 -1 504 0 0 0 \"GET http://xxxxxxxx:80/index HTTP/1.1\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36\" - -\n",
    "@version" => "1",
    "tags" => [
        [0] "_grokparsefailure",
        [1] "_geoip_lookup_failure"
    ],
    "token" => "HFboNZI168vuLmSL1521536284629"
}

@timestamp has today's date, but the time in the message is from January... :frowning:

I have no idea what you have changed to get that behaviour. I would have moved the age filter to just after the date filter. Is that what you did? Can you show your complete config?

Yeah, I did the same thing:

input {
  s3 {
    access_key_id => "*****************"
    secret_access_key => "*******************"
    region => "us-west-1"
    bucket => "abc-logs"
    codec => "plain"
    type => "access_logs"
  }
}

filter {
  if [type] == "elb_access" {
    grok {
      match => {
        "message" => '%{TIMESTAMP_ISO8601:timestamp} %{DATA:name} %{IPORHOST:clientIp}:%{POSINT:clientPort} %{IPORHOST:serverIp}:%{POSINT:serverPort} %{NUMBER:requestProcessingTime} %{NUMBER:backendProcessingTime} %{NUMBER:responseProcessingTime} %{NUMBER:elb_status_code} %{NUMBER:backend_status_code} %{NUMBER:recivedBytes} %{NUMBER:sentBytes} "%{WORD:method} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{QS:agent}'
      }
    }

    mutate {
      convert => { "recivedBytes" => "integer" }
      convert => { "backend_status_code" => "integer" }
      convert => { "elb_status_code" => "integer" }
      convert => { "sentBytes" => "integer" }
      convert => { "requestProcessingTime" => "float" }
      convert => { "responseProcessingTime" => "float" }
      convert => { "backendProcessingTime" => "float" }
      add_field => { "token" => "HFboNZI168vuLmSL1521536284629" }
    }

    geoip {
      source => "clientIp"
    }

    useragent {
      source => "agent"
      target => "useragent"
    }

    date {
      match => [ "timestamp", "ISO8601" ]
      locale => "en"
    }

    age {}
    if [@metadata][age] > 86400 {
      drop {}
    }
  }
}

output {
  if "_grokparsefailure" not in [tags] {
    elasticsearch {
      hosts => "localhost:9200"
      index => "abc"
    }
  }
  stdout { codec => rubydebug }
}

My doubt is not about the grok parse failure; it's about why @timestamp and the time present in the event are different.