Counting docs containing terms returned by a query

Hi,
I have two types of documents:

  • News :
    id
    content
  • Firms :
    id
    name

I want, for each firm, to count the number of times its name appears in the News.

I could write a program in Python to do that (and probably in Painless), but is there a way to do it as a query? I am really junior in Elasticsearch.

Thanks in advance

Hey,

first you would need to find all the companies, and then query the news for each of them. But you may want to rethink your data structure and your indexing pipeline. If you extract all the companies before indexing the news, this query suddenly becomes much simpler, as it would be a terms aggregation on your news.
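
As a rough sketch of the first approach: once the firm names have been fetched, one way to count the news documents mentioning each of them in a single request is a filters aggregation with one match_phrase filter per name. The helper name build_firm_count_query, the aggregation name mentions_per_firm and the example firm names are made up for illustration; the field name content is taken from the question. The function only builds the request body, which would then be sent to the _search endpoint of the news index.

```python
def build_firm_count_query(firm_names, text_field="content"):
    """Build a search body that counts news documents mentioning each firm.

    Hypothetical helper: `firm_names` is assumed to have been read from the
    firms index beforehand, and `text_field` is assumed to be an analyzed
    text field on the news documents.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "mentions_per_firm": {
                "filters": {
                    # One named bucket per firm. The bucket's doc_count is the
                    # number of news documents whose text contains the firm
                    # name as a phrase.
                    "filters": {
                        name: {"match_phrase": {text_field: name}}
                        for name in firm_names
                    }
                }
            }
        },
    }


# Example with made-up firm names
body = build_firm_count_query(["Acme Corp", "Globex"])
```

Note that this counts documents containing the name, not the number of occurrences inside each document, and with a large number of firms the request would probably have to be split into batches.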

So, before you go with the first approach, maybe you can optimize the content you are going to index to improve your query.
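
A minimal sketch of that idea, assuming the list of firm names is available when the news is indexed: each news document gets an extra field listing the firms it mentions, and the count then becomes a plain terms aggregation on that field. The helpers extract_firms and enrich_news, the field name firms (assumed to be mapped as keyword) and the sample data are hypothetical; the matching here is a simple case-insensitive whole-word search, not a real entity extraction.

```python
import re


def extract_firms(content, firm_names):
    """Return the firm names that occur in `content` as whole words."""
    found = []
    for name in firm_names:
        # Lookarounds keep "Globex" from matching inside "Globexx".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, content, flags=re.IGNORECASE):
            found.append(name)
    return found


def enrich_news(doc, firm_names):
    """Copy a news document and add a `firms` field before indexing it."""
    enriched = dict(doc)
    enriched["firms"] = extract_firms(doc["content"], firm_names)
    return enriched


# Search body for the news index once every document carries `firms`
# (assumed to be a keyword field): one bucket per firm, doc_count is the
# number of news documents mentioning it.
FIRM_COUNT_QUERY = {
    "size": 0,
    "aggs": {
        "mentions_per_firm": {
            # the terms aggregation returns 10 buckets by default
            "terms": {"field": "firms", "size": 100}
        }
    },
}

# Example with made-up data
news = {"id": 1, "content": "Acme Corp buys Globex."}
enriched = enrich_news(news, ["Acme Corp", "Globex", "Initech"])
```

The same enrichment could also be done elsewhere in the indexing pipeline; the point is only that the firm names end up in their own field so the aggregation does not have to search the text.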

--Alex

Thanks a lot. I will have a look this evening.
