_dateparsefailure


(Tim Desrochers) #1

Logstash 2.0

I am parsing logs in JSON with date format as:

2015-11-09T17:28:44.978477Z

I am using

date {
  match => ["ts", "UNIX"]
}

I am getting errors that state "_dateparsefailure". I tried to change this to:

date {
  match => ["ts", "ISO8601"]
}

But when I do, I begin to get indices created with strange dates. When I look into the indices with the strange dates I see my data, but the dates are way off (sometimes 1970, sometimes dates in the future). I'm not sure why this is happening. Can someone let me know the best date match filter to use in Logstash for this scenario?

Thanks


(Magnus Bäck) #2

"2015-11-09T17:28:44.978477Z" is an ISO8601 timestamp so ISO8601 is a good choice of date pattern. It works with the example timestamp you gave:

$ cat test.config
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  date {
    match => ["message", "ISO8601"]
  }
}
$ echo '2015-11-09T17:28:44.978477Z' | /opt/logstash/bin/logstash -f test.config
Logstash startup completed
{
       "message" => "2015-11-09T17:28:44.978477Z",
      "@version" => "1",
    "@timestamp" => "2015-11-09T17:28:44.978Z",
          "host" => "hallonet"
}
Logstash shutdown completed

What are the contents of the ts field when the date parsing goes bonkers?


(Tim Desrochers) #3

Sorry for the delay in responding:

When I switch the date filter over to ISO8601 I no longer get "_dateparsefailure", but I begin to get the wackily dated indices. For your reference, the logs in this test database are being generated by BroIDS, and I have used the JSON log converter to change the logs from CSV to JSON to make them easier to parse and ship with Logstash.

The time stamps on the logs when they go out of sorts are:
"ts":"2015-11-13T14:01:54.012641Z"
"ts":"2017-08-27T23:59:59.751356Z"
"ts":"2016-01-26T00:00:00.094326Z"
"ts":"1970-01-01T00:00:00.000000Z"

The 1970 dates always appear to come from the reporter.log with the error:
{"ts":"1970-01-01T00:00:00.000000Z","level":"Reporter::ERROR","message":"stats/Log::WRITER_ASCII: count value too large for JSON: 18446744070608011726"

Besides that, the strange dates are not coming from any one type of Bro log. It seems the date parser is creating some strange dates, which throws off my indices.

I'm happy to run some tests but I'm unsure of where to even start to try to figure out where the problem is.


(Magnus Bäck) #4

The time stamps on the logs when they go out of sorts are:
"ts":"2015-11-13T14:01:54.012641Z"
"ts":"2017-08-27T23:59:59.751356Z"
"ts":"2016-01-26T00:00:00.094326Z"
"ts":"1970-01-01T00:00:00.000000Z"

But these are the input dates, right? If they're wrong I'm not sure what Logstash is supposed to do about it.


(Tim Desrochers) #5

I picked those ts fields out of the index.

I grep'd through the raw logs on the sensor and I don't see any of the dates in the raw logs, leading me to believe that it is the date filter that is causing the issue.

The only other thing I can say is that in some of my logs there are other dates in other fields, some of these dates match the year/month/day of the incorrect dates in the index but not the specific time. Could it be possible that somehow this time in another field, not ts, is getting read and applied as the timestamp?


(Magnus Bäck) #6

I grep'd through the raw logs on the sensor and I don't see any of the dates in the raw logs leading me to believe that it is the date filter that is causing the issue.

If your configuration indeed is

date {
  match => ["ts", "ISO8601"]
}

then Logstash will store the parsed date in the @timestamp field. It won't touch the ts field. Also, the date filter only emits millisecond resolution, but the examples you posted had microsecond resolution. So again, I don't see any indication of Logstash doing anything wrong here.
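The resolution point can be illustrated with a short Python sketch. This is not Logstash's actual Joda-Time code, just a model of the same truncation; `to_logstash_timestamp` is a hypothetical helper:

```python
from datetime import datetime, timezone

def to_logstash_timestamp(iso_ts: str) -> str:
    """Parse an ISO8601 string and render it the way the date filter
    stores @timestamp: millisecond resolution, UTC, trailing 'Z'."""
    dt = datetime.strptime(iso_ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    # Joda-Time (used by Logstash 2.x) keeps only milliseconds, so the
    # last three fractional digits of a microsecond timestamp are dropped.
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"

print(to_logstash_timestamp("2015-11-09T17:28:44.978477Z"))
# 2015-11-09T17:28:44.978Z
```

That matches the @timestamp shown in the rubydebug output above: .978477 in, .978 out.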


(Tim Desrochers) #7

The only reason I raise this is because when I use

date {
match => ["ts", "UNIX"]
}

I don't have index creation problems. When using UNIX time I get the "_dateparsefailure", but the indices are created with true dates. So I'm not sure why reading the ISO time creates the issue. I understand that UNIX is the wrong pattern in this situation, seeing that my times are in ISO format.
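This behaviour is actually consistent: the UNIX pattern expects a number of epoch seconds, so an ISO string never parses, and a failed parse leaves @timestamp at the event's processing time, which is why those indices carry "true" (current) dates. A rough Python model of that fallback (`parse_unix` is a hypothetical stand-in, not the filter's real code):

```python
from datetime import datetime, timezone

def parse_unix(value: str):
    """Rough model of the date filter's UNIX pattern: the field must
    be seconds since the epoch (integer or float)."""
    try:
        return datetime.fromtimestamp(float(value), tz=timezone.utc)
    except ValueError:
        return None  # Logstash instead tags the event _dateparsefailure

# The ISO string is not a number, so parsing fails...
assert parse_unix("2015-11-09T17:28:44.978477Z") is None
# ...and Logstash leaves @timestamp at the processing time, so the
# daily index name looks correct even though the parse failed.
# An actual epoch value would have parsed fine:
assert parse_unix("1447090124.978477") is not None
```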

A little background: I am using this because it was suggested to me. If I don't define a date match filter, what would happen?

Here is my Logstash config from my sensor:

input {
  file {
    path => "/nsm/bro/logs/current/*.log"
    exclude => [
               "/nsm/bro/logs/current/stderr.log",
               "/nsm/bro/logs/current/stdout.log",
               "/nsm/bro/logs/current/communication.log",
               "/nsm/bro/logs/current/loaded_scripts.log"
               ]
    codec => "json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    type => "BRO"
  }
}

filter {
  if [type] == "BRO" {
    date {
      match => ["ts", "ISO8601"]
    }
........
  }
}
output {
  if [type] == "BRO" {
    redis {
      host => "X.X.X.X"
      data_type => "list"
      key => "bro"
    }
  }
}

After this my data goes to a local Redis queue, then off to a central Redis queue. On that central queue I only input from Redis and output to Redis, no filtering. After that the data goes to one of a few indexers where I do all my filtering. My config there is quite large; I do things like translate some fields, mutate fields, de_dot, etc. My output from there looks like:

output {
  if [type] == "BRO" {
    elasticsearch {
      hosts => [ "HOSTNAME" ]
      manage_template => false
      index => "XXXX-bro-%{+YYYY.MM.dd}"
    }
  }
}

So here is where my indices are created. Could it be a problem at my indexers? If I am reading my config correctly, they should be creating an index based on year/month/day, and it should be getting its time from the local box?

Thanks for the help thus far, I am just a bit lost why this is happening.


(Magnus Bäck) #8

So here is where my indices are created. Could it be a problem at my indexers. If I am reading my config correctly they should be creating an index based on year/month/day and it should be getting its time from the local box?

The timestamp used in the index name comes from the @timestamp field, i.e. the field populated by the date filter (unless you override it with the target option). If you don't have a date filter or if it's not able to parse the timestamp string in the field you provide then @timestamp will default to the current date and time.


(Tim Desrochers) #9

I just posted this on the Bro mailing list. I believe it is a Bro bug and nothing to do with Logstash and its parsers. Thanks for troubleshooting with me.

Here are the results of my findings:

for bro I use the following to change the logs to json:

@load tuning/json-logs

redef LogAscii::json_timestamps = JSON::TS_ISO8601;
redef LogAscii::use_json = T;

If I perform a search of the printed logs on my sensor for dates that are in the future I see the following:
zcat -r /nsm/bro/logs/* | grep -r "ts":"2016

{"ts":"2016-08-27T16:10:11.269010Z","uid":"CfxBj3Sb7mz0Dmv8i","id.orig_h":"192.168.10.100","id.orig_p":62662,"id.resp_h":"104.85.62.152","id.resp_p":443,"name":"window_recision","notice":false,"peer":"satcon99-eth2-2"}
{"ts":"2016-02-06T19:13:33.416241Z","fuid":"FJsBOM25KVPMKSxu07","tx_hosts":["23.31.220.33"],"rx_hosts":["17.110.224.215"],"conn_uids":["CIxuLC3WAExVyJisll"],"source":"SSL","depth":0,"analyzers":["SHA1","MD5","X509"],"mime_type":"application/pkix-cert","duration":0.0,"local_orig":true,"is_orig":true,"seen_bytes":674,"missing_bytes":0,"overflow_bytes":0,"timedout":false,"md5":"2a465a2f7f29fc9677683337114d5cb0","sha1":"2f6854f72d69c47ebadd2a859cade1f7492190a2"}
{"ts":"2016-01-30T17:12:50.116824Z","uid":"Cw4Sfa3mwiDWVraPq5","id.orig_h":"23.31.220.33","id.orig_p":12912,"id.resp_h":"17.164.1.22","id.resp_p":443,"proto":"tcp","duration":0.042421,"orig_bytes":0,"resp_bytes":0,"conn_state":"SH","local_orig":true,"local_resp":false,"missed_bytes":0,"history":"FA","orig_pkts":2,"orig_ip_bytes":80,"resp_pkts":0,"resp_ip_bytes":0,"tunnel_parents":[],"resp_cc":"US","sensorname":"satcon99-eth3"}
{"ts":"2016-01-19T00:00:00.078926Z","uid":"CPrS9S3LLs3oGVRAQ2","id.orig_h":"192.168.10.115","id.orig_p":64606,"id.resp_h":"50.31.185.42","id.resp_p":80,"name":"unescaped_special_URI_char","notice":false,"peer":"satcon99-eth2-2"}
{"ts":"2016-10-20T23:59:59.681405Z","fuid":"FQsWce4xIknwzbuBCg","tx_hosts":["54.187.59.44"],"rx_hosts":["192.168.10.143"],"conn_uids":["C8EhsO2mX4IVRshw07"],"source":"HTTP","depth":0,"analyzers":["MD5","SHA1"],"mime_type":"image/gif","duration":0.0,"local_orig":false,"is_orig":false,"seen_bytes":43,"total_bytes":43,"missing_bytes":0,"overflow_bytes":0,"timedout":false,"md5":"325472601571f31e1bf00674c368d335","sha1":"2daeaa8b5f19f0bc209d976c02bd6acb51b00b0a"}
{"ts":"2016-06-09T12:00:00.819970Z","uid":"Con3l44xdVoLgXarc3","id.orig_h":"23.31.220.33","id.orig_p":12721,"id.resp_h":"66.119.33.141","id.resp_p":80,"name":"data_before_established","notice":false,"peer":"satcon99-eth3-1"}
{"ts":"2016-04-20T23:59:59.918852Z","uid":"CBXuzu4OTSZyy7OHS7","id.orig_h":"23.31.220.33","id.orig_p":5109,"id.resp_h":"173.194.123.104","id.resp_p":443,"name":"possible_split_routing","notice":false,"peer":"satcon99-eth3-1"}
{"ts":"2016-06-16T23:59:59.011608Z","uid":"CCmxue4UZiXJPZTcG4","id.orig_h":"192.168.10.104","id.orig_p":57418,"id.resp_h":"10.1.72.2","id.resp_p":5601,"proto":"tcp","duration":3.001216,"orig_bytes":0,"resp_bytes":0,"conn_state":"S0","local_orig":true,"local_resp":false,"missed_bytes":0,"history":"S","orig_pkts":4,"orig_ip_bytes":208,"resp_pkts":0,"resp_ip_bytes":0,"tunnel_parents":[],"sensorname":"satcon99-eth2"}

My conclusion is that something goes wrong when Bro converts Unix time to ISO8601, producing bad timestamps. Logstash is working correctly and faithfully reading the incorrect timestamps.
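That conclusion fits the 1970 entries in particular: Bro's ts fields are seconds since the Unix epoch internally, so an unset or zero value renders as the epoch itself once converted to ISO8601. A quick Python check of that assumption:

```python
from datetime import datetime, timezone

# A zero (unset) epoch-seconds timestamp, rendered the way Bro's JSON
# writer renders TS_ISO8601, is exactly the 1970 value seen in
# reporter.log:
epoch = datetime.fromtimestamp(0.0, tz=timezone.utc)
print(epoch.strftime("%Y-%m-%dT%H:%M:%S.%fZ"))
# 1970-01-01T00:00:00.000000Z
```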

Thanks again


(system) #10