Any setting in the csv filter to disable automatic datatype conversion and keep everything as a string?

In both version 7.x and 8, with the very simple csv filter configuration below, some fields are mapped as "date" for no apparent reason. The documentation describes a "convert" option that lets us change the datatype of specified fields from the apparent default of "string" to integer, float, date, date_time, or boolean, but not the other way around.

    csv {
      separator => "|"
      skip_header => "true"
      quote_char => "¬"
    }

This automatic datatype conversion in the generated mapping (which does not seem to be documented) is causing many records to fail to import, even though that field never contained any date data, only plain strings. The exact error returned by Elasticsearch is:

"reason"=>"failed to parse date field [...some string text...] with format [strict_date_optional_time||epoch_millis]", "caused_by"=>{"type"=>"date_time_parse_exception"

Is there any switch that can disable this conversion completely and just import everything as strings? Thanks in advance.

The csv filter is not doing that conversion; Elasticsearch is. Read up on dynamic mapping and how to turn it off.
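For reference, a minimal sketch of what turning date detection off can look like, written as a Kibana Dev Tools console request. The template name `csv-strings-only` and the index pattern `my-csv-*` are placeholders rather than names from this thread, and the composable `_index_template` API assumes Elasticsearch 7.8 or later:

```console
# Hypothetical template name and index pattern; adjust both to your setup.
# Applies to indices created after the template exists.
PUT _index_template/csv-strings-only
{
  "index_patterns": ["my-csv-*"],
  "template": {
    "mappings": {
      "date_detection": false
    }
  }
}
```

With date detection off, a string value that happens to look like a date is mapped like any other string instead of as `date`. The template has no effect on indices that already exist, and if another template matches the same index pattern you may need to set a `priority` so that this one wins.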

Thanks for the pointer. But once the mapping is created there is no way to change it, and the mapping is created automatically when the first data file is processed. It seems to be a chicken-and-egg problem, which is why I'm looking for a setting in the csv filter. I tried to update the index settings after the index was created by Logstash, but unfortunately got "resource_already_exists_exception" from a request like this:

PUT /filebeat-7.16.2-2022.03.09
{
  "mappings": {
    "date_detection": false
  }
}

With Logstash and the csv filter I don't have to create the mapping up front. All I want is to turn the auto-conversion off for my particular csv filter and keep everything as strings. Is there any way to accomplish that?

Thanks again.

The csv filter does not make it a date, Elasticsearch does. You cannot change the mapping of an existing index. You can update the template so that it changes the mapping when an index rolls over.
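To make the distinction concrete: `PUT /<index>` is the create-index API, which is why it answers with "resource_already_exists_exception" when the index is already there. The sketch below uses the index name quoted earlier in this thread. As far as I know, the update mapping API accepts `date_detection` on an existing index, but that only affects fields that have not been mapped yet; a field already mapped as `date` stays a `date`.

```console
# Inspect which fields were dynamically mapped as "date"
GET /filebeat-7.16.2-2022.03.09/_mapping

# Turn off date detection for fields that have not been mapped yet.
# Fields that already have a "date" mapping are not changed by this.
PUT /filebeat-7.16.2-2022.03.09/_mapping
{
  "date_detection": false
}
```

For the fields that are already wrong, the options are a new index (via rollover, or delete and re-ingest) created under a template that carries the desired mapping.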

Thanks again. Yes, now I understand that Elasticsearch is doing the conversion. But doesn't it seem odd that the csv filter has settings to convert a string to some other datatype (which then override Elasticsearch's conversion, right?), yet nothing to stop Elasticsearch from auto-converting and keep the default string? More puzzling to me is that I have very carefully inspected my data, and in the fields that got auto-converted to date there is absolutely nothing that looks like a date.

Even if the dynamic setting in the mapping can be changed, it's too late for me: the first batch of data already hit failures because I relied on Logstash and the csv filter to create the mapping. :woozy_face:

Logstash can pull data from many sources and send it to many outputs. Don't assume that its design presumes any other Elastic products are involved.

ELK is a common model, but not the only one. (And Elastic are increasingly moving processing options from logstash forward to ingestion pipelines in Elasticsearch, or backwards to processors in beats.)

The csv and date filters can change a string to a LogStash::Timestamp object, and an Elasticsearch output in logstash reformats that to something that Elasticsearch will, by default, process as a date.

Elasticsearch's typing overrides logstash's, not the other way around in this case (it's complicated, so there may be exceptions, and other cases where the opposite is true -- default mapping is really complicated).
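Since the typing decision is made on the Elasticsearch side, a "strings only" rule has to live in the mapping. A hedged sketch, assuming Elasticsearch 7.8 or later; the template name `csv-strings-only`, the index pattern `my-csv-*`, and the dynamic template name `strings_as_keyword` are all placeholders:

```console
# Hypothetical names throughout; adjust the index pattern to match your indices.
# date_detection: false stops date-looking strings from being mapped as "date";
# the dynamic template then maps every new string field as "keyword".
PUT _index_template/csv-strings-only
{
  "index_patterns": ["my-csv-*"],
  "template": {
    "mappings": {
      "date_detection": false,
      "dynamic_templates": [
        {
          "strings_as_keyword": {
            "match_mapping_type": "string",
            "mapping": { "type": "keyword" }
          }
        }
      ]
    }
  }
}
```

Use `"type": "text"` in the dynamic template instead if full-text search on those fields matters more than exact matching and aggregations.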

Thank you very much for the explanation. It now looks like a dead end trying to configure a "string only" setting.

I'll try workarounds such as deleting the whole index and starting over by feeding ELK a one-line dummy file whose fields contain only text (e.g. "field1,field2,field3,..") to see if it still creates a surprising mapping. If it doesn't, the mapping created from this dummy file (with every csv field expected to be a string) should be good for the rest of my data, with no more date_time_parse_exception :crossed_fingers:
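A rough sketch of the console side of that workaround, using the index name quoted earlier in this thread:

```console
# Remove the index that holds the unwanted date mappings (this deletes its data)
DELETE /filebeat-7.16.2-2022.03.09

# After the dummy line has been ingested, check what each field was mapped as
GET /filebeat-7.16.2-2022.03.09/_mapping
```

One caveat: dynamic mapping is decided per index. If the index name carries the date, as `filebeat-7.16.2-2022.03.09` suggests, the next day's index starts with a fresh mapping and the dummy line would have to be ingested first again.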

Thanks again!
