ISO8601 date format : malformed error message in elasticsearch


(Richard) #1

Dear experts,
I am dealing with parsing issues with date format. The date format looks like ISO 8601 : grok filter is perfectly running but no way to keep the log date time stamp as @timestamp in the date section. This only workaround I found is to add the line remove => [ "logTimestamp" ] in the date section.
But this way, I miss what I would like to achieve : use the log time stamp as the elastisearch @timestamp.
Thanks for your support

Here are elasticsearch logs:
[2017-10-09T18:15:14,638][DEBUG][o.e.a.b.TransportShardBulkAction] [i-0012c16b-elk-server-MyInstanceES] [logstash-2017.10.09][4] failed to execute bulk item (index) BulkShardRequest [[logstash-2017.10.09][4]] containing [index {[logstash-2017.10.09][logs][AV8B6zzOeluEE4o-QuEo], source[{"path":"/mondata/log/mongod.log","logTimestamp":"2017-10-09T18:15:14.036+0200","component":"NETWORK","@timestamp":"2017-10-09T16:15:14.036Z","logLevel":"I","@version":"1","host":"0.0.0.0","context":"initandlisten","message":"2017-10-09T18:15:14.036+0200 I NETWORK [initandlisten] connection accepted from 127.0.0.1:50490 #1203802 (2 connections now open)","mongoDB_message":"connection accepted from 127.0.0.1:50490 #1203802 (2 connections now open)"}]}]
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [logTimestamp]
Caused by: java.lang.IllegalArgumentException: Invalid format: "2017-10-09T18:15:14.036+0200" is malformed at "-10-09T18:15:14.036+0200"

Here are some MongoDB log entries:
2017-10-09T18:10:28.012+0200 I NETWORK [conn1203655] end connection 127.0.0.1:50181 (1 connection now open)
2017-10-09T18:10:30.092+0200 I NETWORK [initandlisten] connection accepted from 127.0.0.1:50183 #1203656 (2 connections now open)
2017-10-09T18:10:30.101+0200 I NETWORK [conn1203656] end connection 127.0.0.1:50183 (1 connection now open)

Here is the grok filter for logstash:

filter {
grok {
match => { "message" =>
"%{TIMESTAMP_ISO8601:logTimestamp} %{WORD:logLevel} %{WORD:component} [%{WORD:context}] %{GREEDYDATA:mongoDB_message}"
}
}

if "_grokparsefailure" in [tags] {
drop{ }
}

date {
match => [ "logTimestamp", "ISO8601", "yyyy-MM-dd'T'HH:mm:ss.SSSZ", "yyyy-MM-dd'T'HH:mm:ss.SSSZZZ", "yyyy-MM-dd'T'HH:mm:ss.SSSZZ" ]
target => "@timestamp"

}
}


(Magnus Bäck) #2

What are the mappings of the index? Use ES's get mapping API.


(Richard) #3

Thanks Manus for your reply,
here is the ES mapping output extract for the field logTimestamp from command line:
curl -XGET '10.0.20.10:9200/logstash-2017.10.10/_mapping/logs?pretty=1'

      "logTimestamp" : {
        "type" : "date",
        "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"

As a consequence, I can understand that I need to update the ES mapping to support yyyy-MM-dd'T'HH:mm:ss.SSSZ date format ... but I have still 2 questions:

1- Why the date is not first matched in Logstash by the date section
date {
match => [ "logTimestamp", "ISO8601", "yyyy-MM-dd'T'HH:mm:ss.SSSZ", "yyyy-MM-dd'T'HH:mm:ss.SSSZZZ", "yyyy-MM-dd'T'HH:mm:ss.SSSZZ" ]
target => "@timestamp

2- How to update logTimestamp date format ES mapping ? I get error message :
slight_smile:"error" : {
"root_cause" : [
{
"type" : "index_already_exists_exception",

with the API request :
curl -XPUT '10.0.20.10:9200/logstash-2017.10.10?pretty' -H 'Content-Type: application/json' -d' {"mappings": {"logs": {"properties": {"logTimestamp": { "type": "date", "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ"} } } } }'

Thanks again for your support


(Magnus Bäck) #4

Why the date is not first matched in Logstash by the date section

Not sure. If the date filter fails to parse a timestamp it'll log a message and point to the part of the string where the mismatch was. Is the date filter really failing, i.e. are the events getting the _dateparsefailure tag?

How to update logTimestamp date format ES mapping ?

You have to create a new index. Existing type mappings can't be modified.


(Richard) #5

Hello Magnus,
your feeling is the good one. There is actually no error in logstash, neither at grok level nor at date level. Only ES timestamp mapping was misconfigured.
I decided to skip ES mapping by removing logTimestamp field in order to keep @timestamp as a single datetime field. Off course @timestamp is set to logTimestamp value to keep "real' timestamp of the log event i.e. not the timestamp when the log event has been collected by Logstash.
Here under is my setup.
Thanks again. Have a good day.

filter {
grok {
match => { "message" =>
"%{TIMESTAMP_ISO8601:logTimestamp} %{WORD:logLevel} %{WORD:component} [%{WORD:context}] %{GREEDYDATA:mongoDB_message}"
}
}

if "_grokparsefailure" in [tags] {
drop{ }
}

date {
match => [ "logTimestamp", "ISO8601", "yyyy-MM-dd'T'HH:mm:ss.SSSZ", "yyyy-MM-dd'T'HH:mm:ss.SSSZZZ", "yyyy-MM-dd'T'HH:mm:ss.SSSZZ" ]
target => "@timestamp"
remove_field => [ "logTimestamp" ]
}
}

output {
elasticsearch {
hosts => "http://10.0.20.10:9200"
}
}


(Yuri) #6

Same issue here.

Logs from elasticsearch:

[2017-10-12T07:51:29,393][DEBUG][o.e.a.b.TransportShardBulkAction] [Mm4RSgy] [filebeat-2017.10.12][0] failed to execute bulk item (index) BulkShardRequest [[filebeat-2017.10.12][0]] containing [index {[filebeat-2017.10.12][erlang-log][AV8PkR1veqkyyPWvhhcB], source[{"log_timestamp":"2017-10-12 07:51:24.425","offset":9620846,"input_type":"log","log_level":"error","source":"/var/log/front/erl_error.log","message":"2017-10-12 07:51:24.425 [error] <0.23677.124> gen_server <0.23677.124> terminated with reason: connection_closed","type":"erlang-log","env":"production","tags":["beats_input_codec_plain_applied"],"@timestamp":"2017-10-12T07:51:24.425Z","service":"backend","@version":"1","beat":{"name":"bknd1","hostname":"bknd1","version":"5.6.3"},"host":"bknd1"}]}]
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [log_timestamp]
	at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:298) ~[elasticsearch-5.6.0.jar:5.6.0]
    ...
	at 	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: java.lang.IllegalArgumentException: Invalid format: "2017-10-12 07:51:24.425" is malformed at "-10-12 07:51:24.425"
	at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187) ~[joda-time-2.9.5.jar:2.9.5]
	...

Log from logstash:

[2017-10-12T07:51:29,397][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-2017.10.12", :_type=>"erlang-log", :_routing=>nil}, 2017-10-12T07:51:24.425Z bknd1 2017-10-12 07:51:24.42 [error] <0.23677.124> APNS disconnected for AppId <<"491322606">> (debug: false). Connection: {sslsocket,{gen_tcp,#Port<0.1710830>,tls_connection,undefined},<0.23684.124>}], :response=>{"index"=>{"_index"=>"filebeat-2017.10.12", "_type"=>"erlang-log", "_id"=>"AV8PkR1veqkyyPWvhhcA", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [log_timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2017-10-12 07:51:24.425\" is malformed at \"-10-12 07:51:24.425\""}}}}}

logstash filter:

filter {
  if [type] == "erlang-log" {
    grok {
      match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} \[%{LOGLEVEL:log_level}\] %{GREEDYDATA}"]
    }
    date {
      match => [ "log_timestamp", "ISO8601" ]
    }
  }
}

The problem that I found is that there is a conflict in log_timestamp format in different log type.
This is another filter:

filter {
  if [type] == "nginx-error" {
    grok {
      match => ["message", "(?<log_timestamp>%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}) \[%{LOGLEVEL:log_level}\] %{DATA}, host: \"%{NOTSPACE:http_host}\"%{GREEDYDATA}"]
    }
    date {
      match => [ "log_timestamp", "YYYY/MM/dd HH:mm:ss" ]
    }
  }
}

The issue appears because of the fact that the first definition produce the field of type 'text' and the second of type 'date'.
The kibana dashboard agrees on this:

Field Type Conflict: The type of this field changes across indices. It is unavailable for many analysis functions. The indices per type are as follows:
Field Type	Index Names
date        filebeat-2017.10.01, filebeat-2017.10.10, filebeat-2017.10.12
text        filebeat-2017.10.11,... 

So while I was writing this post I almost understood the causes of the issue. The problem is that the first filter (erlang-log) does not match log_timestamp as date, right? But why then? It is parsed by the grok pattern 'TIMESTAMP_ISO8601', but isn't matched by '[ "log_timestamp", "ISO8601" ]'?

P.S. The stack version is 5.6.3.


(Yuri) #7

Looks like something like this happened again:


(Richard) #8

Hello Yuri,
I would say that you are not facing any parsing issue from logstash. The pb occurs only when logstash requests ES to write the event. ES replies with an error due to ES mapping issue.
Check log_timestamp mapping within ES thanks to GET API curl -XGET 'YourHost:9200/YourIndex/_mapping/YourField?pretty=1'
Regards. Richard.


(Yuri) #9

I checked.

The mapping for erlang-log:

          "log_timestamp" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },

The mapping for nginx-error:

          "log_timestamp" : {
            "type" : "date",
            "format" : "yyyy/MM/dd HH:mm:ss||yyyy/MM/dd||epoch_millis"
          },

So the error is expected. The only issue here is that the match => [ "log_timestamp", "ISO8601" ] doesn't seem to parse the string like 2017-10-12 07:51:24.425.
So what is happening essentially:

  1. The first entry of the index appears to be of type nginx-error. The mapping is created automatically (?) with log_timestamp as date.
  2. The entries of type 'erlang-log' cannot be saved to the index because the log_timestamp there is of type text.

(system) #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.