Count the occurrences of words in Elasticsearch

Hello everyone,
I am new to Elasticsearch.
I was wondering if there is a way to count the occurrences of a word (or an array of words) in a document.

For example, say I want to search for the words ["to", "as", "cash"] across many documents (say 100,000), with a batch size of 25 documents. I want to count how many times these words occur in each document and show that count in my view. That is, I want that count (not the _score) for each document, sent back with the query response so that I can use it or store it for later.

Perhaps look into using the term vectors API.

Setting `term_statistics` to `true` (default is `false`) will return:

* total term frequency (how often a term occurs in all documents)

* document frequency (the number of documents containing the current term)

By default these values are not returned since term statistics can have a serious performance impact.
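To make the distinction concrete, here is a minimal Python sketch of how the per-document count and the two wider statistics sit side by side in the `terms` part of a term vectors response when `term_statistics` is enabled. The `sample_terms` dictionary and the `describe` helper are hypothetical and the numbers are made up; only the field names `term_freq`, `ttf` and `doc_freq` are taken from the API.

```python
# Hand-written fragment shaped like the "terms" object of a term vectors
# response requested with term_statistics=true. The numbers are invented.
sample_terms = {
    "document": {"term_freq": 3, "doc_freq": 12, "ttf": 40},
    "cash": {"term_freq": 1, "doc_freq": 2, "ttf": 5},
}


def describe(term, stats):
    """Summarise one term, separating the per-document count from the wider ones."""
    return (
        f"{term}: {stats['term_freq']} in this document, "
        f"{stats.get('ttf', 'n/a')} across all documents, "
        f"found in {stats.get('doc_freq', 'n/a')} documents"
    )


summaries = [describe(term, stats) for term, stats in sample_terms.items()]
for line in summaries:
    print(line)
```

The per-document number is `term_freq`, and it is returned even when `term_statistics` is left at `false`; `ttf` and `doc_freq` are the extra values that only appear when it is switched on.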

Hi, thanks for the reply. I want the term frequency for every single document, but this is going to return the term frequency across all documents (in my case 25). Could you suggest a way to get the term frequency for each document separately, the way _score is reported for each document?

Just specify the document in the API call. For example:

POST termstest/_doc/1
{
  "message": "Hi, thanks for the reply. I want the term frequency for every single document and it is going to return me the term frequency in all documents (in my case 25). So, please can you suggest a way to deduce term frequency for each document. separately, like the _score is for each document."
}

and then

GET termstest/_termvectors/1?fields=message

You'll see the term frequency of every term, and that the word "document" is found 3 times.
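To get from that response to a count for a specific list of words, something like the following Python sketch could work on the client side. It assumes the response has already been parsed into a dictionary; `count_words` is a hypothetical helper, and `sample_response` is a hand-trimmed stand-in for the real output (the real one also carries offsets, positions and field statistics).

```python
def count_words(termvectors_response, field, words):
    """Map each requested word to its term_freq in one document (0 if absent)."""
    terms = (
        termvectors_response.get("term_vectors", {})
        .get(field, {})
        .get("terms", {})
    )
    # Assumption: the field uses an analyzer that lowercases tokens
    # (the standard analyzer does), so lookups are lowercased too.
    return {w: terms.get(w.lower(), {}).get("term_freq", 0) for w in words}


# Hand-trimmed stand-in for the response to
# GET termstest/_termvectors/1?fields=message
sample_response = {
    "_id": "1",
    "found": True,
    "term_vectors": {
        "message": {
            "terms": {
                "document": {"term_freq": 3},
                "documents": {"term_freq": 1},
                "to": {"term_freq": 2},
            }
        }
    },
}

counts = count_words(sample_response, "message", ["to", "as", "cash"])
print(counts)  # {'to': 2, 'as': 0, 'cash': 0}
```

For a batch of 25 hits you could call the term vectors API once per document ID and run the same helper on each response. There is also a multi term vectors endpoint (`_mtermvectors`) that accepts several IDs in one request and returns a `docs` array, which may be worth checking for your version. Note that the URL shape shown above is the typeless one from newer releases; on 5.6 the path should include the mapping type, so check the docs for that version.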

Thank you so much for the solution, Rich. I am working with Elasticsearch 5.6 and have integrated it with Rails. I will reply here once it is implemented and working. Thanks again.
