This would allow me to easily analyze the search queries by performing a terms aggregation on parameters.q.
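For context, a minimal sketch of such a terms aggregation, assuming the request parameters are dynamically mapped as individual fields (the index name and the `.keyword` sub-field are assumptions based on the default dynamic mapping):

```json
GET logs/_search
{
  "size": 0,
  "aggs": {
    "top_queries": {
      "terms": { "field": "parameters.q.keyword" }
    }
  }
}
```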
But as I've learned, this leads to sparse documents, which are said to hurt performance. It's also possible to hit the default limit of 1000 total mapped fields by mapping parameters that way.
But as Lucene 7/Elasticsearch 6 will come with better support for sparse documents (https://www.elastic.co/blog/elasticsearch-6-0-0-alpha1-released#sparse-doc-values), does the recommendation against sparse documents still hold true? Will the field limit of 1000 still remain in ES 6? How would you map request parameters to Elasticsearch now and in ES 6?
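For reference, the field limit is a per-index setting and can be raised, at the cost of a larger mapping and cluster state (the index name here is hypothetical):

```json
PUT logs/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```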
Whereas the sparse-documents issue has improved dramatically, you still run into the problem of an exploding field mapping and a huge cluster state. Have you thought of changing your model to a nested type with key/value elements?
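A minimal sketch of such a nested key/value mapping (index, field, and sub-field names are assumptions; the mapping-type wrapper is omitted and depends on your ES version):

```json
PUT logs
{
  "mappings": {
    "properties": {
      "params": {
        "type": "nested",
        "properties": {
          "key":   { "type": "keyword" },
          "value": { "type": "keyword" }
        }
      }
    }
  }
}
```

Every request parameter then becomes one `{ "key": ..., "value": ... }` object, so the number of mapped fields stays constant no matter how many distinct parameter names show up.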
As someone with almost the same problem (each user has its own "schema"), this nested trick doesn't help much: searches and aggregations become wrong because of it, and Kibana is a lot nicer to use without this schema (for example, in Kibana you may need to show the keys "foo" and "spam" in the same graph).
That's important information for me, thanks. Seems like the nested approach is the only way to go performance-wise.
But the problem is that I can't do a terms aggregation on the foo parameter then, right? Or is there a workaround for that?
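One common workaround is to combine a nested aggregation with a filter on the key and a terms aggregation on the value. A sketch, assuming a nested `params` field with `key`/`value` keyword sub-fields (hypothetical names):

```json
GET logs/_search
{
  "size": 0,
  "aggs": {
    "params": {
      "nested": { "path": "params" },
      "aggs": {
        "only_foo": {
          "filter": { "term": { "params.key": "foo" } },
          "aggs": {
            "foo_values": {
              "terms": { "field": "params.value" }
            }
          }
        }
      }
    }
  }
}
```

The nested aggregation steps into the nested documents, the filter narrows them to `key == "foo"`, and the inner terms aggregation then buckets the matching values, which is roughly equivalent to a terms aggregation on a dedicated `parameters.foo` field.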
Maybe I'll add a configuration option for an explicit whitelist of parameters that should be converted to the non-nested version, and use nested parameters by default...