Date mapping issue


#1

I've been getting this warning when running my Logstash pipeline. I'm using the date filter on this field with the pattern yyyy-MM-dd HH:mm:ss, so I'm not sure why I'm seeing this warning.

[2017-12-21T21:27:55,443][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"my-index-2017-09", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x60490837>], :response=>{"index"=>{"_index"=>"my-index-2017-09", "_type"=>"doc", "_id"=>"KUH5emABCABy_jCovBSF", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [page.post_date]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"0000-00-00 00:00:00\" is malformed at \" 00:00:00\""}}}}}

I can see the date is all zeros, so there is obviously something wrong in the data source, but I would still have thought that the date filter would have formatted it to be in a date format Elasticsearch could parse.


(Nachiket) #3

Does your date field contain all zeros? In that case Logstash will not parse it, as that is not a valid timestamp.
Regards,
N


#4

It does. But I have a conditional in Logstash saying if it is all zeros then remove the field. Are you saying that won't even work?


(Nachiket) #5

Is your conditional coming before you convert the string to date format?
Try putting it before you use the date filter.
Regards
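
For example, a minimal sketch of that ordering — the field name is taken from the [page.post_date] in the error message above, and the exact comparison string is an assumption based on the reported value:

```
filter {
  # Drop the placeholder value before the date filter ever sees it
  if [page][post_date] == "0000-00-00 00:00:00" {
    mutate {
      remove_field => ["[page][post_date]"]
    }
  }
  date {
    match => ["[page][post_date]", "yyyy-MM-dd HH:mm:ss"]
  }
}
```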


#6

Ah, sorry for the confusion. Yes, it does strip it out. However, I'm wondering why it can't recognize the all zeros date to begin with. Isn't Elasticsearch used for all kinds of data analytics? What if someone wants a time chart of something historical like word usage that goes back into BC? That wouldn't be possible?


(Oldcodeoberyn) #7

Hmm, I have the same problem, I wonder if I should paste something here, or create a new topic


(Nachiket) #8

Excellent point.
Whether Elasticsearch can handle it is a question the devs should probably answer. The question about BC timestamps is interesting too. Maybe someone else can shed some light.


(Oldcodeoberyn) #9

Today I am trying to use the Logstash date plugin to copy the timestamp in the log into the @timestamp field.
There are 585857 records in total,


but when Logstash transforms them, only 537727 succeed.

The others are rejected by Elasticsearch:

[2017-12-22T10:57:58,008][DEBUG][o.e.a.b.TransportShardBulkAction] [app_101283-2017.12.22][2] failed to execute bulk item (index) BulkShardRequest [[app_101283-2017.12.22][2]] containing [31] requests
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [timestamp]
	at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:291) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:473) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DocumentParser.parseValue(DocumentParser.java:597) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:395) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:372) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:93) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:67) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:261) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:703) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:681) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:662) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.executeIndexRequestOnPrimary(TransportShardBulkAction.java:548) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.executeIndexRequest(TransportShardBulkAction.java:140) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:236) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:123) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:110) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:72) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:1033) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:1011) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:104) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:358) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:298) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:974) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:971) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:238) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:2188) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:983) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction.access$500(TransportReplicationAction.java:97) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:319) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:659) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) [elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.0.0.jar:6.0.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: java.lang.IllegalArgumentException: Invalid format: "22/Dec/2017:10:43:50 +0800" is malformed at "/Dec/2017:10:43:50 +0800"
	at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187) ~[joda-time-2.9.5.jar:2.9.5]
	at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826) ~[joda-time-2.9.5.jar:2.9.5]
	at org.elasticsearch.index.mapper.DateFieldMapper$DateFieldType.parse(DateFieldMapper.java:244) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.DateFieldMapper.parseCreateField(DateFieldMapper.java:450) ~[elasticsearch-6.0.0.jar:6.0.0]
	at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:286) ~[elasticsearch-6.0.0.jar:6.0.0]
	... 38 more

As far as I can see there is nothing wrong with this timestamp: "22/Dec/2017:10:43:50 +0800". All the other records have a similar format, but some of them fail to parse in Elasticsearch...



(Oldcodeoberyn) #10

Here is the Logstash log:

[2017-12-22T10:57:58,015][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"FcAbfGABywB1gOlxgvRg", :_index=>"app_101283-2017.12.22", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x5084bdcf>], :response=>{"index"=>{"_index"=>"app_101283-2017.12.22", "_type"=>"doc", "_id"=>"FcAbfGABywB1gOlxgvRg", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"22/Dec/2017:10:43:50 +0800\" is malformed at \"/Dec/2017:10:43:50 +0800\""}}}}}
[2017-12-22T10:57:58,015][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"CzAbfGABM12IwqqD6_bs", :_index=>"app_101283-2017.12.22", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0xd5ac17c>], :response=>{"index"=>{"_index"=>"app_101283-2017.12.22", "_type"=>"doc", "_id"=>"CzAbfGABM12IwqqD6_bs", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"22/Dec/2017:10:44:24 +0800\" is malformed at \"/Dec/2017:10:44:24 +0800\""}}}}}
[2017-12-22T10:57:58,015][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"ocAbfGABywB1gOlx8Pa_", :_index=>"app_101283-2017.12.22", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x2509f0d1>], :response=>{"index"=>{"_index"=>"app_101283-2017.12.22", "_type"=>"doc", "_id"=>"ocAbfGABywB1gOlx8Pa_", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"22/Dec/2017:10:44:44 +0800\" is malformed at \"/Dec/2017:10:44:44 +0800\""}}}}}
[2017-12-22T10:57:58,015][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"n8AbfGABywB1gOlx8Pa_", :_index=>"app_101283-2017.12.22", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x37e00906>], :response=>{"index"=>{"_index"=>"app_101283-2017.12.22", "_type"=>"doc", "_id"=>"n8AbfGABywB1gOlx8Pa_", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"22/Dec/2017:10:44:26 +0800\" is malformed at \"/Dec/2017:10:44:26 +0800\""}}}}}
[2017-12-22T10:57:58,087][INFO ][logstash.pipeline        ] Pipeline terminated {"pipeline.id"=>"main"}

And the config:

input{
  elasticsearch {
    hosts => "10.50.5.52:9200"
    # hosts => ["localhost:9200"]
    index => "app_101283-*"
    query => '{ "query": { "match_all": {} } }'
    size => 10000
    scroll => "5m"
    docinfo => true
    codec => json
  }
}

filter {
  if "IAM-" in [fields][service] {
    if "access" in [source]{
      grok {
        patterns_dir => ["./patterns"]
        match => {
          "message" => "%{IAM_ACCESS_LOG}"
        }
        
      }
      mutate {
        add_field => {"isAccss" => true}
      }
      date {
        match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      }  
    }
    else {
      grok {
        patterns_dir => ["./patterns"]
        match => {
          "message" => "%{IAMLOG}"
        }
      }
      date {
        match => ["timestamp", "MMM  dd HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
      }       
    }   
  }

}

output{
  elasticsearch {
    hosts => ["localhost:9200"]
    document_id => "%{[@metadata][_id]}"
    index =>  "%{[@metadata][_index]}"
    user => "elastic"
    password => "changeme"
  }
}

(Oldcodeoberyn) #11

There is no _dateparsefailure in tags, which means all the timestamps are parsed correctly; the events are just rejected by Elasticsearch.


#12

I tried matching the date format with the date filter and it still gives that error that the event can't be indexed. The odd thing is that the pattern shown is the old pattern, almost as though the date filter isn't catching the field at all to transform it into one Elasticsearch likes.


(Oldcodeoberyn) #13

Agree with you. It seems to be caused by the date plugin not catching the event and transforming the field into the correct format.


(Oldcodeoberyn) #14

I found the root cause in my case. :smiley: In my index there are two kinds of log: one is the Spring Boot log and the other is the Spring Boot embedded Tomcat access log. They live on the same machine and I collect them together. Since they have different timestamp formats, I use two date plugins for them. The key point here is that I grok both into the same field name, "timestamp", so the two date plugins seem to mix up each other's data. When I changed the field name to "iso_timestamp" for one and "httpd_timestamp" for the other, the issue no longer appeared. (It was a permanent issue before.)

filter {
  if "IAM-" in [fields][service] {
    if "access" in [source]{
      grok {
        patterns_dir => ["./patterns"]
        match => {
          "message" => "%{IAM_ACCESS_LOG}"
        }
        
      }
      mutate {
        add_field => {"isAccss" => true}
      }
      date {
        match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      }  
    }
    else {
      grok {
        patterns_dir => ["./patterns"]
        match => {
          "message" => "%{IAMLOG}"
        }
      }
      date {
        match => ["timestamp", "MMM  dd HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
      }       
    }   
  }

}
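
For reference, a sketch of the renamed-field variant described above — this assumes the grok patterns are also updated to emit the new field names (iso_timestamp / httpd_timestamp):

```
filter {
  if "IAM-" in [fields][service] {
    if "access" in [source] {
      grok {
        patterns_dir => ["./patterns"]
        match => { "message" => "%{IAM_ACCESS_LOG}" }
      }
      date {
        # access log has its own field, so the two date filters cannot collide
        match => ["httpd_timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      }
    }
    else {
      grok {
        patterns_dir => ["./patterns"]
        match => { "message" => "%{IAMLOG}" }
      }
      date {
        match => ["iso_timestamp", "MMM  dd HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
      }
    }
  }
}
```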

(Oldcodeoberyn) #15

Anyway, I think this is a bug in the Logstash date plugin... because this issue only happens on some records.


(Magnus Bäck) #16

This thread is very confusing and it's not clear to me which problems remain to be solved. Regarding

[2017-12-22T10:57:58,015][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"n8AbfGABywB1gOlx8Pa_", :_index=>"app_101283-2017.12.22", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x37e00906>], :response=>{"index"=>{"_index"=>"app_101283-2017.12.22", "_type"=>"doc", "_id"=>"n8AbfGABywB1gOlx8Pa_", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"22/Dec/2017:10:44:26 +0800\" is malformed at \"/Dec/2017:10:44:26 +0800\""}}}}}

the problem is clearly that the date filter writes the parsed date to the @timestamp field and leaves the timestamp field behind, and that's the field that ES has problems with. Either delete the redundant timestamp field or reconfigure the date filter to store the results in timestamp.
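
Either option is a small change in the date filter, e.g. (sketch, using the access-log pattern from the config above):

```
date {
  match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  # Option 1: drop the original string field after a successful parse
  remove_field => ["timestamp"]
  # Option 2 (instead of the above): store the parsed date back into
  # the timestamp field rather than the default @timestamp
  # target => "timestamp"
}
```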


(Magnus Bäck) #17

However, I'm wondering why it can't recognize the all zeros date to begin with. Isn't Elasticsearch used for all kinds of data analytics? What if someone wants a time chart of something historical like word usage that goes back into BC? That wouldn't be possible?

I think one could make the case that year zero and even negative years should be supported, but the timestamp in question has zero in the month and day components and that's never valid.


#18

Drrrr, that makes perfect sense. Thank you!


(system) #19

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.