I want, for each firm, to count the number of times its name appears in the news.
I could write a program in Python to do that (and probably in Painless), but is there a way to do it as a query? I am really junior in Elasticsearch.
First you would need to find all the companies and then build a query for each of them. But you may want to rethink your data structure and your indexing pipeline. If you extract all the companies before indexing the news, this query suddenly becomes much simpler, as it would be a terms aggregation on your news.
So, before you go with the first approach, maybe you can optimize the content you are going to index to improve your query.
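To make the second approach a bit more concrete, here is a rough sketch in Python. It assumes you already have a list of firm names (`KNOWN_COMPANIES` is made up for illustration), that the news text lives in a field called `body`, and that the extracted names are stored in a `companies` field mapped as `keyword`. All of those names are assumptions, not something from your setup. The extraction here is plain substring matching on whole words; a real pipeline might use something smarter.

```python
import re

# Hypothetical list of firm names; in practice this comes from your own data.
KNOWN_COMPANIES = ["Acme Corp", "Globex", "Initech"]


def extract_companies(text, known=KNOWN_COMPANIES):
    """Return the known company names mentioned in text.

    Matching is case-insensitive and on whole words only.
    """
    found = []
    for name in known:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found


def enrich(doc):
    """Return a copy of a news document with a 'companies' field added.

    Assumes the article text is in doc["body"] (hypothetical field name).
    """
    return {**doc, "companies": extract_companies(doc["body"])}


def company_counts_query(size=100):
    """Build a search body with a terms aggregation on 'companies'.

    "size": 0 at the top level skips the hits and returns only the buckets.
    """
    return {
        "size": 0,
        "aggs": {
            "news_per_company": {
                "terms": {"field": "companies", "size": size}
            }
        },
    }
```

You would run `enrich` on each news document before indexing it, then send the body from `company_counts_query()` to the `_search` endpoint of your news index. Note that a terms aggregation counts documents per company, so a firm mentioned three times in one article still counts once for that article.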