Need a term frequency report across the entire index

bragboy · October 14, 2017, 1:22pm

I would like to get a term frequency report for a relatively large index.

Here is the background. I have defined something I call a grouping, which is simply a result set. If my index holds a million documents, a grouping would contain roughly 4,000 to 5,000 of them. Within that result set, I would like to mine the interesting keywords and perhaps build a report from them for analysis.

I am still in the exploration phase, so I would like to see the most commonly used terms and their total term frequency (TTF), not just for single words but for sequences of one, two, or three words. An example of a three-word sequence is "Advanced Encryption Standards". I will very likely encounter noise among the one-word terms, but my assumption is that I can filter it out by defining stopwords.
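To make the report I have in mind concrete, here is a minimal sketch in plain Python of the counting itself, done client-side over the text of a result set. It does not use any Elasticsearch or Solr API. The names `tokenize`, `ngram_counts`, the sample `docs`, and the tiny `STOPWORDS` set are all made up for illustration, and the tokenizer is a naive lowercase split that a real analyzer would not match exactly.

```python
import re
from collections import Counter

# Hypothetical stopword list, only for illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "are", "to", "in"}


def tokenize(text):
    """Naive tokenizer: lowercase, keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_counts(docs, max_n=3, stopwords=STOPWORDS):
    """Count every 1..max_n word sequence across all documents.

    Stopwords are dropped only from the one-word counts, since that is
    where I expect the noise; longer sequences are kept as they appear.
    """
    counts = Counter()
    for text in docs:
        tokens = tokenize(text)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if n == 1 and gram[0] in stopwords:
                    continue
                counts[" ".join(gram)] += 1
    return counts


# Stand-in for the text of a result set (a "grouping").
docs = [
    "Advanced Encryption Standards are widely used.",
    "The Advanced Encryption Standards replaced older ciphers.",
]
counts = ngram_counts(docs)

# Print the ten most frequent terms and sequences.
for term, freq in counts.most_common(10):
    print(freq, term)
```

This is only workable when the result set is small enough to pull the text out of the index, which is why I am looking for a way to get the same numbers from the index directly.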

I looked at Term Vectors, but that is not what I want: it focuses on a single document rather than on a result set (or the entire index). I also have no input keywords, since my objective is to discover them.

I have experience with Solr and ES, but this problem is relatively new to me. I went through various documents but could not narrow things down (maybe I did not spend enough time!). Can someone please point me to the right place to look?

Any pointers are greatly appreciated!

system · November 11, 2017, 1:23pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
