Is there something similar to Elasticsearch's significant_text aggregation, but for significant fields?
Example: when querying for computers, the field "RAM capacity" is uncommonly common in the foreground set of JSON documents. Is there an aggregation or JSON query type that will return those uncommonly common field names (RAM capacity, CPU type, HDD capacity, etc.)?
Nothing out of the box.
A couple of options spring to mind:
Do it yourself on the client side: compute the significance from foreground/background counts obtained by combining the global agg (for background stats) with the filters agg, using one exists query per field of interest.
Re-index the content with a fieldnames array of type keyword that lists the fields present on each doc, then use the significant_terms agg on that field to compute the significance of each field name.
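Here is a rough Python sketch of both options, in case it helps. It only builds request bodies and post-processes a response as plain dicts, so it does not depend on any client library. The helper names, the agg names ("foreground", "background", "fields") and the example field names are all made up for illustration. The score is a JLH-style heuristic (absolute change times relative change in the share of docs having the field); treat it as one reasonable choice, not necessarily what you would want in production.

```python
def build_exists_request(foreground_query: dict, fields: list) -> dict:
    """Option 1: count field presence in the foreground and in the whole index."""
    exists_filters = {f: {"exists": {"field": f}} for f in fields}
    return {
        "size": 0,
        "track_total_hits": True,  # so hits.total is an exact foreground size
        "query": foreground_query,
        "aggs": {
            # Evaluated against the docs matching the query (foreground).
            "foreground": {"filters": {"filters": exists_filters}},
            # The global agg ignores the query, giving background counts.
            "background": {
                "global": {},
                "aggs": {"fields": {"filters": {"filters": exists_filters}}},
            },
        },
    }


def jlh_style_score(fg_count: int, fg_total: int, bg_count: int, bg_total: int) -> float:
    """(fg% - bg%) * (fg% / bg%); 0.0 when the field is not over-represented."""
    if fg_total == 0 or bg_total == 0 or bg_count == 0:
        return 0.0
    fg_pct = fg_count / fg_total
    bg_pct = bg_count / bg_total
    if fg_pct <= bg_pct:
        return 0.0
    return (fg_pct - bg_pct) * (fg_pct / bg_pct)


def rank_fields(response: dict) -> list:
    """Turn the response to build_exists_request into (field, score) pairs, best first."""
    fg_total = response["hits"]["total"]["value"]
    fg_buckets = response["aggregations"]["foreground"]["buckets"]
    background = response["aggregations"]["background"]
    bg_total = background["doc_count"]
    bg_buckets = background["fields"]["buckets"]
    scored = [
        (name, jlh_style_score(bucket["doc_count"], fg_total,
                               bg_buckets[name]["doc_count"], bg_total))
        for name, bucket in fg_buckets.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def add_fieldnames(doc: dict) -> dict:
    """Option 2, index time: copy the doc and list its top-level field names."""
    enriched = dict(doc)
    enriched["fieldnames"] = sorted(doc.keys())
    return enriched


def build_significant_terms_request(foreground_query: dict) -> dict:
    """Option 2, query time: let significant_terms score the field names."""
    return {
        "size": 0,
        "query": foreground_query,
        "aggs": {
            "significant_fields": {"significant_terms": {"field": "fieldnames"}}
        },
    }
```

Note that add_fieldnames only looks at top-level keys; nested objects would need flattening into dotted paths first, and fieldnames has to be mapped as keyword for the second option to work. The first option needs no re-indexing but requires you to know the candidate fields up front, whereas the second discovers them for you.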