Is the Logstash mutate filter enough to get fields indexed in ES with the correct type?

(Gabteni) #1


I'm not sure I have properly understood what the mutate filter does and what it doesn't do.

After a lot of pain I've finally imported CSV data into Elasticsearch. Logstash was treating every single CSV field as a string; obviously I was not using the mutate filter.

Then I deleted the index and retried using the mutate filter. With debug mode enabled I could see that Logstash converts the inputs correctly, but in ES they still got indexed as "string".

I know that we cannot change a field's type in ES, so if we need to we must reindex, but what I'm talking about is a bit different, I guess. When Logstash is launched the index has not been created yet, so in my mind I'm not creating a data conflict.

Do I have to create and map the entire index first, and only push data through Logstash after the mapping is in place? Can't the mutate filter be used as a kind of mapper, since we are talking about a new index?

Thank you very much

(Christian Dahlqvist) #2

The mutate filter determines how fields are formatted in the JSON document sent to Elasticsearch, and in some cases this is enough for the dynamic mapping within Elasticsearch to map them correctly, e.g. integers and floats. Other types, e.g. IP addresses, are still sent as strings, and dynamic mapping will not be able to identify them.
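To make the point about JSON formatting concrete, here is a minimal Python sketch. It is not Logstash itself, and the column names `price` and `client_ip` are made up for illustration. It mimics what a conversion step does: a value parsed from CSV starts out as a string and only becomes a JSON number once it is explicitly cast.

```python
import csv
import io
import json

# Hypothetical CSV content; the column names are illustrative only.
raw = 'price,client_ip\n"42","10.0.0.1"\n'

# The csv module, like a plain CSV parse, yields every value as a string.
row = next(csv.DictReader(io.StringIO(raw)))

# Without conversion the number is serialized as a quoted JSON string.
unconverted = json.dumps(row)

# After an explicit cast (analogous to a convert step) it is an unquoted JSON number.
converted_row = dict(row, price=int(row["price"]))
converted = json.dumps(converted_row)

print(unconverted)  # {"price": "42", "client_ip": "10.0.0.1"}
print(converted)    # {"price": 42, "client_ip": "10.0.0.1"}
```

Note that the IP address stays a quoted string in both cases, since JSON has no notation for it.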

(Gabteni) #3

Hi Christian, thanks for that point. I never even knew that JSON has types.

Next I will try to learn a bit about dynamic and static mapping.

In my case, I'm getting data from a CSV, and double-quoted numbers are treated as strings. For those, do you think I need to define a mapping beforehand?

(Christian Dahlqvist) #4

JSON only allows a few different field types, e.g. booleans, strings, integers and floats, and these can be identified through dynamic mapping. Dynamic mapping is also able to identify string fields that contain a timestamp in the standard format. Anything beyond that generally needs to be managed through mappings and index templates.
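As a rough illustration of that rule, the Python sketch below guesses a type from each JSON value. It is only an approximation under stated assumptions, not Elasticsearch's actual implementation: the helper `guess_dynamic_type` is hypothetical, the type names follow recent Elasticsearch versions (older ones report `string` rather than `text`), and the single timestamp format checked here stands in for the real date detection.

```python
import json
from datetime import datetime


def guess_dynamic_type(value):
    """Hypothetical helper: approximate the type dynamic mapping would pick."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        try:
            # Assumption: one ISO-style format stands in for real date detection.
            datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
            return "date"
        except ValueError:
            return "text"
    # Objects, arrays and nulls are not handled in this sketch.
    return "other"


doc = json.loads(
    '{"count": 42, "ratio": 0.5, "active": true, '
    '"ip": "10.0.0.1", "quoted": "42", "ts": "2016-05-01T12:00:00"}'
)
guesses = {field: guess_dynamic_type(value) for field, value in doc.items()}
print(guesses)
```

In this sketch both the IP address and the quoted number fall through to `text`, which is why such fields need either a conversion before indexing or an explicit mapping.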

(Gabteni) #5

Thank you very much for your help.

As is nearly always the case, the issue I was facing was human; it had nothing to do with ES or Logstash.

Earlier I wrote: "Logstash converts the inputs correctly, but in ES they still got indexed as 'string'."

That is what I thought, but Logstash was not actually converting every field properly :)

I mistakenly used "rename" instead of "convert"; now I understand better why all the fields were indexed as strings and, at the same time, why the indexed fields had "integer" as a label.

Anyway, I'll mark this post as solved. The next few days will probably be fun.