How to get terms and their frequencies from all documents in Elasticsearch without using facets?


(Abhishek Jajoria) #1

I want the count of every term in an Elasticsearch index without using
facets. I do not want to use facets because they make my queries slow.


(Ivan Brusic) #2

Jörg Prante created a plugin for this functionality:
https://github.com/jprante/elasticsearch-index-termlist

Note: I have never used it.

Announcement: http://elasticsearch-users.115913.n3.nabble.com/Ann-Elasticsearch-Index-Termlist-Plugin-td3855426.html

--
Ivan

On Mon, Jul 30, 2012 at 4:45 AM, jajoria abhishek
jajoria.abhishek@gmail.com wrote:

I want the count of every term in an Elasticsearch index without using
facets. I do not want to use facets because they make my queries slow.


(Vinicius Carvalho) #3

I think the plugin only returns a list of the terms in a field. I'm looking
for something similar as well: a view like the one Luke gives, showing the
rank of each term in the index. Also, if your index contains a lot of unique
terms, I don't think this plugin would fit (this is based on the plugin
announcement).

Regards

On Monday, July 30, 2012 2:32:10 PM UTC-4, Ivan Brusic wrote:

Jörg Prante created a plugin for this functionality:
https://github.com/jprante/elasticsearch-index-termlist Note: I have
never used it.

Announcement:
http://elasticsearch-users.115913.n3.nabble.com/Ann-Elasticsearch-Index-Termlist-Plugin-td3855426.html

--
Ivan



(Ivan Brusic) #4

It shouldn't be too hard to modify the plugin to support what you
want, but it might perform slowly. The difficult part of writing
plugins for Elasticsearch is supporting data that is distributed
across shards. The Lucene code for collecting terms is fairly simple
compared with the rest of the code. Here is where Jörg actually calls
out to Lucene:

https://github.com/jprante/elasticsearch-index-termlist/blob/master/src/main/java/org/elasticsearch/action/termlist/TransportTermlistAction.java

You can add logic there to iterate through the TermDocs for each term
and collect statistics.
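
To make the idea concrete, here is a rough sketch in plain Python. It is
not the plugin's code and it does not use the Lucene API: the postings
dictionary, build_postings, shard_term_stats and merge_shards are all
made-up names standing in for a shard's inverted index, the per-term
TermDocs walk, and the cross-shard merge. It only shows the shape of the
work: collect per-term counts on each shard, then sum them.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical stand-in for one shard's inverted index:
# term -> {doc_id: occurrences of the term in that document}
Postings = Dict[str, Dict[int, int]]


def build_postings(docs: Dict[int, str]) -> Postings:
    """Build a toy postings structure by whitespace-tokenizing each document."""
    postings: Postings = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            per_doc = postings.setdefault(token, {})
            per_doc[doc_id] = per_doc.get(doc_id, 0) + 1
    return postings


def shard_term_stats(postings: Postings) -> Tuple[Counter, Counter]:
    """Walk every term's postings on one shard and collect its statistics."""
    doc_freq: Counter = Counter()    # number of documents containing the term
    total_freq: Counter = Counter()  # total occurrences of the term
    for term, per_doc in postings.items():
        doc_freq[term] = len(per_doc)
        total_freq[term] = sum(per_doc.values())
    return doc_freq, total_freq


def merge_shards(stats: Iterable[Tuple[Counter, Counter]]) -> Tuple[Counter, Counter]:
    """Sum per-shard counts; valid because each document lives on one shard."""
    doc_freq: Counter = Counter()
    total_freq: Counter = Counter()
    for shard_df, shard_tf in stats:
        doc_freq.update(shard_df)
        total_freq.update(shard_tf)
    return doc_freq, total_freq


# Two toy "shards" with made-up documents.
shard_a = build_postings({1: "quick brown fox", 2: "quick quick dog"})
shard_b = build_postings({3: "lazy dog", 4: "brown dog"})
doc_freq, total_freq = merge_shards(
    shard_term_stats(p) for p in (shard_a, shard_b)
)

# Rank terms by total occurrences, roughly like a top-terms view.
for term, count in total_freq.most_common(3):
    print(term, "docs:", doc_freq[term], "occurrences:", count)
```

The merge step is the cheap part here only because the toy data is tiny.
On a real index with many unique terms, every shard has to ship its whole
term table to the coordinating node, which is likely where the slowness
would come from.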

Cheers,

Ivan

On Mon, Jul 30, 2012 at 12:00 PM, Vinicius Carvalho
viniciusccarvalho@gmail.com wrote:

I think the plugin only returns a list of the terms in a field. I'm looking
for something similar as well: a view like the one Luke gives, showing the
rank of each term in the index. Also, if your index contains a lot of unique
terms, I don't think this plugin would fit (this is based on the plugin
announcement).

Regards


