Does threshold of 1000 fields per index mean "analyzed fields"?

(Thomas Widhalm) #1


I have the problem that we regularly run into the threshold of 1000 fields per index. Can I prevent this from happening by setting just the few fields I really need to analyzed and all others to not_analyzed?

I found this post ( Limit of total fields [1000] in index has been exceeded ) but I'm still not sure whether changing the analyzed status of most of my fields would allow me to index more than 1000.

I know I can raise the threshold, but since I know about the danger of "mapping explosion" I'd prefer to stick with it.

Thanks in advance.

(Mark Walkom) #2

It's total fields, irrespective of whether they are text or keyword.
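To illustrate the point that every mapped field counts regardless of its type, here is a minimal sketch that tallies the entries in a mapping's `properties` tree. It assumes that each leaf field, each object field, and each multi-field counts once toward the limit; the function name `count_mapped_fields` and the sample mapping are hypothetical, and the exact counting rules may differ between Elasticsearch versions.

```python
def count_mapped_fields(properties):
    """Count entries in a mapping's "properties" tree.

    Assumption: every leaf field, object field, and multi-field counts
    once, whatever its type (text, keyword, date, ...).
    """
    total = 0
    for definition in properties.values():
        total += 1  # the field (or object) itself
        # Sub-fields of an object field are counted recursively.
        total += count_mapped_fields(definition.get("properties", {}))
        # Multi-fields, e.g. a keyword sub-field under a text field.
        total += len(definition.get("fields", {}))
    return total


# Hypothetical mapping, for illustration only.
example_properties = {
    "message": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
    "status": {"type": "keyword"},
    "user": {
        "properties": {
            "name": {"type": "text"},
            "id": {"type": "keyword"},
        }
    },
}

print(count_mapped_fields(example_properties))
```

Under these assumptions, switching a field from text to keyword leaves the total unchanged; only removing fields from the mapping lowers it.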

(Thomas Widhalm) #3

Thanks for your reply!

Does this mean I have no way of putting a large number of different fields into Elasticsearch without ending up with excessive metadata?

I talked to the developers and they told me they need all the data in Elasticsearch for "manual" review in Kibana, but they only need very few fields to be searchable or graphable.
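One option that is not discussed in this thread, but that matches the requirement described above, is to map only the few searchable fields explicitly and set `dynamic` to `false`. Elasticsearch then keeps unmapped fields in `_source` without adding them to the mapping, so they are not searchable or aggregatable. The sketch below only builds the request body for creating such an index; the helper name `build_mapping` and the field names are hypothetical, and the typeless mapping format assumes Elasticsearch 7.x or later (older versions nest the mapping under a type name).

```python
import json


def build_mapping(searchable_fields):
    """Build an index-creation body that maps only the given fields.

    "dynamic": False means new, unmapped fields are not added to the
    mapping; they remain in _source but are not indexed.
    """
    return {
        "mappings": {
            "dynamic": False,
            "properties": {
                name: {"type": field_type}
                for name, field_type in searchable_fields.items()
            },
        }
    }


# Hypothetical field names, for illustration only.
create_index_body = build_mapping(
    {"@timestamp": "date", "level": "keyword", "message": "text"}
)

print(json.dumps(create_index_body, indent=2))
```

The trade-off is that unmapped fields can only be read from the document source; whether that is sufficient for the "manual" review in Kibana would need to be checked against the Kibana version in use.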

(Mark Walkom) #4

There is a way: increase that limit :) But of course there is a risk there.

Why are the documents so large?
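For reference, the limit mentioned above is controlled by the index setting `index.mapping.total_fields.limit`, which can be changed on an existing index through the update settings API. The sketch below only assembles the HTTP method, path, and JSON body for such a request; the helper name `build_field_limit_update`, the index name `app-logs`, and the value 1500 are hypothetical.

```python
import json


def build_field_limit_update(index_name, new_limit):
    """Return (method, path, body) for raising the total-fields limit."""
    if new_limit < 1:
        raise ValueError("new_limit must be a positive integer")
    path = f"/{index_name}/_settings"
    body = {"index.mapping.total_fields.limit": new_limit}
    return "PUT", path, json.dumps(body)


# Hypothetical index name and limit, for illustration only.
method, path, body = build_field_limit_update("app-logs", 1500)

print(method, path)
print(body)
```

Raising the value does not remove the risk discussed in this thread; it only moves the point at which indexing requests start to be rejected.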

(Ravi Shanker Reddy) #5

Can you explain what the risk is if my document is too large? It might help many others.

(Thomas Widhalm) #6

In this special project developers have to log each and every variable with its value from their software. This is a very special project where policies like this cannot be changed. The policy is so fixed that even the Elastic Stack might be fully replaced with another tool if that tool can log all the data. I really hope that doesn't happen.

I didn't want to increase the limit because when I allow for, say, 1500 fields, next month they will hit that limit again, and so on. Therefore I want a way that scales to a lot more than 1000 fields per index.

(Thomas Widhalm) #7

Hi. The risk is called "mapping explosion". It means that Elasticsearch has to handle so much metadata it can't handle actual data any more. See for more information.

(Mark Walkom) #8

Basically it adds to the cluster state, which is stored in memory. So a large cluster state can potentially cause excess memory use.

It also increases cardinality, aka the diversity of the data. This means that Lucene needs more resources to store the same amount of data and Elasticsearch needs more resources to query it.
