ES 6 sparse docs and index.mapping.total_fields.limit

felixbarny · July 14, 2017, 1:21pm

I'm currently trying to understand how to map HTTP request parameters into Elasticsearch.

Ideally, I'd have one field per query parameter:

parameters:
  foo: "bar"
  baz: "qux"
  q: "search query"

This would allow me to easily analyze the search queries by performing a terms aggregation on parameters.q.
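For reference, a request body for that aggregation might look like the sketch below. It assumes `parameters.q` is mapped as a `keyword` field; with default dynamic mapping, strings are indexed as `text` with a `.keyword` sub-field, so the field to aggregate on would be `parameters.q.keyword` instead. The helper name and the aggregation name `top_queries` are made up for illustration.

```python
import json


def terms_agg_body(field, size=10):
    """Build a search body that returns only a terms aggregation on `field`."""
    return {
        "size": 0,  # skip the hits, only the aggregation buckets are needed
        "aggs": {
            "top_queries": {  # hypothetical aggregation name
                "terms": {"field": field, "size": size}
            }
        },
    }


# Assumption: parameters.q is a keyword field (see the note above).
body = terms_agg_body("parameters.q")
print(json.dumps(body, indent=2))
```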

But as I've learned, this leads to sparse documents, which are discouraged. It's also possible to hit the total fields limit of 1000 fields by mapping parameters that way.
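To make the field growth concrete, here is a rough sketch that counts how many distinct fields a one-field-per-parameter mapping would create for a handful of requests. The sample requests and the helper name are invented for illustration, and the count is only a lower bound on what the mapping would really contain. The last dict is the body one could send to `PUT /<index>/_settings` to raise `index.mapping.total_fields.limit`; the value 2000 is an arbitrary example, not a recommendation.

```python
def count_parameter_fields(requests):
    """Count distinct field names a one-field-per-parameter mapping would need."""
    fields = set()
    for params in requests:
        fields.update("parameters." + name for name in params)
    return len(fields)


# Hypothetical sample of request parameters seen over time.
requests = [
    {"foo": "bar", "baz": "qux", "q": "search query"},
    {"q": "another query", "page": "2"},
    {"utm_source": "newsletter"},
]
n_fields = count_parameter_fields(requests)
print(n_fields)

# index.mapping.total_fields.limit is a dynamic index setting (default 1000).
# Example body for PUT /<index>/_settings; 2000 is an arbitrary value.
raise_limit_body = {"index.mapping.total_fields.limit": 2000}
```

Every new parameter name adds a field, so user-controlled parameter names can grow the mapping without bound regardless of where the limit is set.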

But as Lucene 7/Elasticsearch 6 will come with better support for sparse documents (https://www.elastic.co/blog/elasticsearch-6-0-0-alpha1-released#sparse-doc-values), does the recommendation against sparse documents still hold true? Will the field limit of 1000 still remain in ES 6? How would you map request parameters to Elasticsearch now and in ES 6?

felixbarny · July 17, 2017, 8:25am

@warkolm Hey bud, could you help me out here?

felixbarny · July 24, 2017, 2:21pm

@spinscale do you have any information on this?

spinscale · July 25, 2017, 10:57am

Hey,

While the sparse documents issue has been improved dramatically, you still run into the issue of an exploding field mapping and a huge cluster state. Have you thought of changing your model to a nested type with elements like

parameters: [
  { "key": "foo", "value": "bar" },
  { "key": "spam", "value": "eggs" }
]

to prevent mapping explosion?
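A minimal sketch of that model, assuming both `key` and `value` are mapped as `keyword` so they can be filtered and aggregated exactly. The helper name `to_nested` is hypothetical. The mapping below shows only the `properties` block; in Elasticsearch 6 it would sit under a mapping type name inside `mappings`.

```python
def to_nested(params):
    """Convert a flat {name: value} dict into a list of key/value objects."""
    return [{"key": k, "value": v} for k, v in params.items()]


# Only three mapped fields (parameters, parameters.key, parameters.value),
# no matter how many distinct parameter names show up.
nested_mapping = {
    "properties": {
        "parameters": {
            "type": "nested",
            "properties": {
                "key": {"type": "keyword"},    # assumption: exact-match keys
                "value": {"type": "keyword"},  # assumption: exact-match values
            },
        }
    }
}

doc = {"parameters": to_nested({"foo": "bar", "spam": "eggs"})}
print(doc)
```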

Amit_Ripshtos · July 25, 2017, 11:19am

As a person who shares almost the same problem (each user has its own "schema"), this nested trick is not helping much, since the searches/aggregations become wrong because of it, and it's a lot nicer to use Kibana without this schema (because, for example, in Kibana you need to show the keys "foo" and "spam" in the same graph).

felixbarny · July 25, 2017, 1:55pm

That's important information for me, thanks. Seems like the nested approach is the only way to go performance-wise.

But the problem is that I can't do a terms aggregation on the foo parameter, right? Or is there a workaround for that?
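One possible workaround, sketched under the assumption that `parameters` is a nested field with `keyword` sub-fields `key` and `value`: wrap a `filter` aggregation on the key inside a `nested` aggregation, then run the `terms` aggregation on the value. The helper name and the aggregation names (`params`, `only_key`, `values`) are made up for illustration.

```python
def nested_terms_agg(key, size=10):
    """Build a search body that buckets the values of one nested parameter."""
    return {
        "size": 0,  # only the aggregation result is of interest
        "aggs": {
            "params": {
                # Step into the nested key/value objects.
                "nested": {"path": "parameters"},
                "aggs": {
                    "only_key": {
                        # Keep only the objects whose key matches.
                        "filter": {"term": {"parameters.key": key}},
                        "aggs": {
                            "values": {
                                "terms": {
                                    "field": "parameters.value",
                                    "size": size,
                                }
                            }
                        },
                    }
                },
            }
        },
    }


body = nested_terms_agg("foo")
```

This needs a hand-written request per parameter, so it does not address the Kibana usability concern raised above.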

Maybe I'll add a configuration option for an explicit whitelist of parameters which should be converted to the non-nested version, and use nested parameters by default...
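A rough sketch of how such a whitelist split could work. The function name and the target field names `parameters` and `nested_parameters` are hypothetical, chosen only for this example.

```python
def split_parameters(params, whitelist):
    """Put whitelisted parameters in flat fields and the rest in a nested list."""
    flat = {k: v for k, v in params.items() if k in whitelist}
    nested = [
        {"key": k, "value": v}
        for k, v in params.items()
        if k not in whitelist
    ]
    return {"parameters": flat, "nested_parameters": nested}


doc = split_parameters(
    {"foo": "bar", "baz": "qux", "q": "search query"},
    whitelist={"q"},
)
print(doc)
```

With this split, the number of flat fields is bounded by the size of the whitelist, while a plain terms aggregation still works on the whitelisted parameters.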

