Logstash trying to parse date to the wrong field

Hi, Logstash is suddenly showing quite strange behaviour: I think it's trying to parse a text field as a date field, but I really don't know how to interpret this message in
logstash-plain.log:

[2017-07-06T16:23:43,116][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"cons33-edicola", :_type=>"cons33", :_routing=>nil}, 2017-07-06T14:23:42.862Z HUelastic %{message}], :response=>{"index"=>{"_index"=>"cons33-edicola", "_type"=>"cons33", "_id"=>"AV0YSP12as7FI9I-fVyb", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [Data]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"Corriere della Sera + Edizioni L...\""}}}}}

I installed my template as I always do, letting Elasticsearch manage the date field. It worked until today, but now I get this.

Here's my mapping:

curl -XPUT 192.168.136.10:9200/_template/cons33-edicola -d '
{
  "template": "cons33-edicola",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "refresh_interval": "30s"
  },
  "mappings": {
    "cons33": {
      "properties": {
        "IDMEDIA": { "type": "keyword" },
        "ENTE": { "type": "keyword" },
        "IDUSER": { "type": "keyword" },
        "Anno": { "type": "keyword" },
        "titolo": { "type": "keyword" },
        "Data": { "type": "date", "format": "yyyy-MM-dd" }
      }
    }
  }
}
'
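As a side note, with that mapping the Data field only accepts strings of the exact shape yyyy-MM-dd, so any other value in that column triggers a mapper_parsing_exception. A rough shape check can be done with a regex (slightly stricter than what the date parser actually accepts; the two sample values are hypothetical):

```shell
# Keep only values shaped like yyyy-MM-dd; anything else would be
# rejected by the "Data" date mapping with a mapper_parsing_exception.
printf '%s\n' '2017-07-06' 'Corriere della Sera + Edizioni L' \
  | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
```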

And this is my logstash conf:

input {
    file {
        path => "/home/../samlpe.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}

filter {
    csv {
        columns => ["IDMEDIA", "ENTE", "IDUSER", "titolo", "Anno", "Data"]
        separator => ","
    }

    mutate {
        convert => {
            "IDMEDIA" => "string"
            "ENTE" => "string"
            "IDUSER" => "string"
            "Anno" => "string"
            "titolo" => "string"
        }
    }

    mutate {
        remove_field => ["message"]
    }
}

output {
    elasticsearch {
        action => "index"
        hosts => ["localhost"]
        index => "cons33-edicola"
        document_type => "cons33"
        template_name => "cons33-edicola"
        template_overwrite => false
        manage_template => false
    }
    stdout {}
}
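One way to keep bad rows out of the index entirely (a sketch, not part of the original setup): parse Data in Logstash with a date filter and drop any event it can't parse, relying on the filter's default _dateparsefailure tag:

```
filter {
    date {
        match => ["Data", "yyyy-MM-dd"]
        target => "Data"
    }
    if "_dateparsefailure" in [tags] {
        drop {}
    }
}
```

With this in place, only events whose Data field parses as yyyy-MM-dd ever reach Elasticsearch, so a malformed row can no longer fail the bulk index with a 400.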


Has this ever happened to you? Any explanation?
Thank you.

So Data is supposed to be a date field? Then it looks like you have bad input since the Data field in this case begins with "Corriere della Sera". Perhaps the CSV file is corrupt?

Thanks Magnus, I double-checked both fields and they're OK. Yes, "Data" is supposed to be a date field, but the strange thing is that if I remove the text field from the CSV completely, it works. Could it be a hardware issue maybe? I don't know; Logstash seems pretty intense on the CPU (the CSV is around 18 million rows), and the Logstash server has only 2 CPUs and 8 GB of RAM left.

Thanks Magnus, I double-checked both fields and they're OK. Yes, "Data" is supposed to be a date field, but the strange thing is that if I remove the text field from the CSV completely, it works.

I'm not sure exactly what you mean, but key to solving the problem is reproducing the problem. What exact line is causing the problem? Which line has "Corriere della Sera" in the Data field?

Could it be a hardware issue maybe?

That's very unlikely.

I don't know; Logstash seems pretty intense on the CPU (the CSV is around 18 million rows), and the Logstash server has only 2 CPUs and 8 GB of RAM left.

Logstash will attempt to parse the file input as quickly as it can and is therefore going to use all the CPU it can get. 2 CPUs and 8 GB RAM is plenty.
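The hunt for the exact offending line can be scripted. A hypothetical reproduction (the sample file and its values are invented, standing in for the real CSV): if the titolo column contains an unquoted comma, the csv filter shifts every later column, so the sixth field is no longer a date. This one-liner prints any row whose sixth comma-separated field doesn't look like yyyy-MM-dd:

```shell
# Hypothetical sample standing in for the real CSV: row 2 has an
# unquoted comma inside the title, which shifts all later columns.
cat > /tmp/sample.csv <<'EOF'
M1,E1,U1,Some Title,2017,2017-07-06
M2,E2,U2,Corriere della Sera + Edizioni L,Extra,2017,2017-07-06
EOF

# Print the line number and content of every row whose 6th field
# is not shaped like yyyy-MM-dd.
awk -F, '$6 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {print NR": "$0}' /tmp/sample.csv
```

Running the same command against the real 18-million-row file would list exactly the lines whose Data column Elasticsearch rejects.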

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.