Drawbacks of mapping all types as string?

Ibrahim_Awwal · March 21, 2017, 12:34am

Hi!

I'm using an ELK (2.x) stack to make logs from our web application searchable. This usually works pretty well, but I'm logging request parameters, and the dynamic mapping gets confused when the type of a parameter changes between requests. Sometimes the change happens for reasonable reasons (sometimes it's not particularly reasonable, but that's a separate issue...). When I know that something is supposed to be a string, for example, I add an exception to my dynamic mapping, and when the type is just different I sometimes do some mutation with grok to change the parameter name. But this feels like playing whack-a-mole, as more and more code paths with new and different parameter types keep causing new issues.

My question: what's the drawback of just mapping all fields as strings? For instance, what do I gain by having number fields indexed as number types? Does it use more/less space?

Also, in particular, will this work for object-valued parameters, or nested arrays of objects?

I guess the two options I see for mapping everything as a string are:

1. Disable automatic type creation, a la https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dynamic-mapping.html#_disabling_automatic_type_creation
2. Define a dynamic mapping for all field names to string, something like the topic "Force all the JSON values as String type - Error - Merging dynamic updates triggered a conflict"

As I understand it, if I do the latter I could still add exceptions for specific whitelisted fields if I know their type in advance? But would that be useful?
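For concreteness, here is a rough sketch of what the second option might look like, written as a Python helper that builds the mapping body. It is only an illustration under assumptions: it targets the 2.x mapping syntax (`_default_` mapping, `string` type), it adds one dynamic template per scalar type that dynamic mapping can detect so that objects are left alone, and the whitelisted field name `response_time_ms` and the helper `build_mapping` are made up for the example. I have not run it against a real cluster.

```python
import json

# Scalar types that dynamic mapping can detect from JSON values.
# "object" is deliberately left out so nested objects keep their structure.
SCALAR_TYPES = ["string", "long", "double", "boolean", "date"]


def build_mapping(whitelist=None):
    """Build a mapping body that maps every detected scalar to string.

    `whitelist` is an optional dict of explicit field mappings (hypothetical
    field names) that should keep a specific type.
    """
    templates = [
        {
            "%s_as_string" % detected: {
                "match": "*",                    # any field name
                "match_mapping_type": detected,  # whatever type was detected
                "mapping": {"type": "string"},   # ...is stored as a string
            }
        }
        for detected in SCALAR_TYPES
    ]
    return {
        "mappings": {
            "_default_": {
                "dynamic_templates": templates,
                # Explicitly mapped fields are not subject to the templates.
                "properties": dict(whitelist or {}),
            }
        }
    }


# Example: keep one (hypothetical) numeric field as a number.
body = build_mapping({"response_time_ms": {"type": "long"}})
print(json.dumps(body, indent=2))
```

The printed JSON would be the body to use when creating an index, or to adapt into an index template if the indices are created by Logstash.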

Overall, my use case is mainly just to be able to search through logs for events and find specific incidents, filtering by things like pages or users affected. Most of the time the integer valued fields are things like database IDs.

Thanks for the help!

Oh, one related issue: I sometimes have object values with dynamic property names. If I do keep dynamic mapping enabled, is there a way to accept any property names? Is that bad for performance?

dadoonet · March 21, 2017, 6:31am

If you think about response times, for example, then as strings 1000ms will be "faster" than 200ms, which is obviously not true: strings are compared character by character, so "1000" sorts before "200".

So basically, if you need to compare, sort, or aggregate on such fields, use numbers. If it's just about search, then string is fine IMO. Strings will probably take more space than numbers, though.
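The ordering problem is easy to reproduce outside Elasticsearch. This small Python sketch (the sample durations are made up) shows how the same values sort as strings versus as numbers; sorting and range comparisons on a string-mapped field would be expected to follow the string ordering.

```python
# Made-up response times in milliseconds.
durations = [1000, 200, 35]

# Sorting the values as strings compares them character by character.
as_strings = sorted(str(d) for d in durations)

# Sorting the values as numbers compares their magnitudes.
as_numbers = sorted(durations)

print(as_strings)      # ['1000', '200', '35']
print(as_numbers)      # [35, 200, 1000]
print("1000" < "200")  # True: '1' comes before '2'
print(1000 < 200)      # False
```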

Ibrahim_Awwal · March 21, 2017, 8:16am

Ah okay, that's about what I figured. I think for the most part I'm fine with just searching. I just want my records to get stored at all, which right now sometimes doesn't happen when the dynamic mapping has given a field a type that's not inclusive enough.

