I'm new to Elasticsearch and cannot solve the following issue. I used the sampler and significant_text aggregations to find the most popular words in my dataset, and that worked very well. However, I would like to order the resulting buckets by their doc_count in descending (or ascending) order, and I have not been able to find a proper way to do it.
I tried the “order” parameter (and many other things) without success. Is it even possible to sort buckets when using the sampler and significant_text aggregations? For example, sorting worked with the terms aggregation but not with significant_text. Any ideas? Thanks a lot in advance for any help.
“Most popular” is not usually a desirable sort order for text because it would generally give you words like “the”, regardless of what your query was. That said, if you really want that behaviour you can use a custom scripted scoring heuristic to rank on either the foreground or the background count.
Mark, I have the same problem.
Could you share your proposed approach in code?
I have another question: the Kibana dashboard has significant terms but not significant text. How do I make a significant text graph in a Visualize dashboard?
Dear Mark,
Thanks a lot for the prompt response. Would it be possible to provide example code, or just a snippet, of your proposed solution? Thanks in advance.
I don’t have access to a computer to give a full working snippet, but the docs you need are here
You’d use a Painless script that just returns _superset_freq (the background document count for the term) as the score.
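For anyone who wants something concrete, here is a rough, untested sketch of what such a request body might look like, built as a Python dict so it can be printed as JSON and pasted into a search request. The field name "content", the query text, the aggregation names "sample" and "keywords", and the helper build_request are all made up for illustration. The script_heuristic block and the _subset_freq / _superset_freq variables follow the significance heuristics section of the significant terms docs, but returning the raw count as the score is an assumption about how the suggestion above would be written, so check it against your Elasticsearch version.

```python
import json


def build_request(text_field, query_text, rank_on="background",
                  size=10, shard_size=200):
    """Build a search body whose significant_text buckets are scored by a raw count.

    rank_on="foreground" scores each term by _subset_freq (documents in the
    sampled result set containing the term); rank_on="background" scores it by
    _superset_freq (documents in the background set containing the term).
    """
    freq_var = {"foreground": "_subset_freq",
                "background": "_superset_freq"}[rank_on]
    return {
        "size": 0,  # only the aggregation results are of interest
        "query": {"match": {text_field: query_text}},
        "aggs": {
            "sample": {  # hypothetical aggregation name
                "sampler": {"shard_size": shard_size},
                "aggs": {
                    "keywords": {  # hypothetical aggregation name
                        "significant_text": {
                            "field": text_field,
                            "size": size,
                            # Assumption: the script simply returns the count,
                            # so buckets come back ranked by that count.
                            "script_heuristic": {
                                "script": {
                                    "lang": "painless",
                                    "source": f"params.{freq_var}",
                                }
                            },
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    # "content" and "bird flu" are placeholder values.
    print(json.dumps(build_request("content", "bird flu"), indent=2))
```

Since significant_text returns buckets ordered by score, a score equal to the count should give a descending order on that count; for ascending order the simplest option is probably to reverse the bucket list on the client side.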
Kibana does not currently support significant_text. I expect the best thing to do is to open an issue on the Kibana GitHub repository asking for support to be added to something like the tag cloud visualization.