Mapping with a large number of fields

(Bittu Sarkar) #1

I have a use case where the mapping may have around 5-10 thousand fields or even more! I've read about the mapping explosion problem. I tried to get around it by using nested documents for the set of fields that can be created dynamically by the client, and a flat list of fields for everything else. This has made queries very complex, since I need to handle both kinds of fields (flat + nested), so I'm re-thinking my decision to go ahead with nested fields. New fields are added to the mapping very rarely: no more than once or twice per day. Currently they're indexed as nested fields.
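To make the flat + nested complexity concrete, here is a minimal sketch of how one search request ends up being built. It assumes the dynamic fields live under a hypothetical nested path `custom_fields` with `name` and `value` sub-fields, and that `status` is a hypothetical flat field. These names are only illustrative and not taken from the actual mapping.

```python
import json


def term(field, value):
    """Build a term filter clause for a single field."""
    return {"term": {field: value}}


def nested_field(path, name, value):
    """Match one dynamic field stored as a name/value pair in a nested doc.

    `path`, `<path>.name` and `<path>.value` are assumed (hypothetical) names.
    """
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "filter": [
                        term(f"{path}.name", name),
                        term(f"{path}.value", value),
                    ]
                }
            },
        }
    }


def build_query(flat_filters, dynamic_filters, path="custom_fields"):
    """Combine flat-field filters and nested dynamic-field filters."""
    clauses = [term(field, value) for field, value in flat_filters.items()]
    # Each dynamic field needs its own nested clause so that name and value
    # are matched within the same nested document.
    clauses += [
        nested_field(path, name, value)
        for name, value in dynamic_filters.items()
    ]
    return {"query": {"bool": {"filter": clauses}}}


if __name__ == "__main__":
    body = build_query({"status": "active"}, {"color": "red"})
    print(json.dumps(body, indent=2))
```

Every dynamic field adds a whole `nested` clause rather than a single `term`, which is where most of the extra query-building logic comes from.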

I understand the problem of cluster state management, but I guess Elasticsearch 2.0 has solved it to a great extent by publishing cluster state diffs instead of the entire cluster state whenever possible. What I want to understand is what impact a mapping of this size has on indexing and search performance.
