Document frequency of phrases

I am implementing the sequential dependency model in Elasticsearch using
a dynamic script, and I need the collection frequency (the shard
frequency in ES) of a phrase. The term frequency of a phrase within a
document can be computed by comparing the positions of each term in the
phrase, but how can I get the collection frequency or document frequency
of a phrase? And if that is possible, how can I get the collection
frequency of two terms that occur near each other within a given slop?
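
To make the quantities concrete, here is a minimal plain-Python sketch of what I mean. It is not Elasticsearch or Lucene code: the `shard` dictionary is a made-up stand-in for positional postings, and all function names are hypothetical. The near-terms count is a simplification that counts every pair of positions closer than `window`, which may differ from how a given retrieval model defines unordered windows.

```python
from typing import Dict, List, Tuple

# Hypothetical positional postings for one document: term -> sorted positions.
Doc = Dict[str, List[int]]


def phrase_tf(doc: Doc, terms: List[str]) -> int:
    """Count exact, in-order occurrences of the phrase in one document."""
    if not terms or any(t not in doc for t in terms):
        return 0
    later = [set(doc[t]) for t in terms[1:]]
    # A match starts at `start` if every later term sits at the next offset.
    return sum(
        all(start + i + 1 in positions for i, positions in enumerate(later))
        for start in doc[terms[0]]
    )


def near_tf(doc: Doc, a: str, b: str, window: int) -> int:
    """Count position pairs of a and b less than `window` apart, in any order."""
    return sum(
        1
        for pa in doc.get(a, [])
        for pb in doc.get(b, [])
        if pa != pb and abs(pa - pb) < window
    )


def phrase_stats(shard: Dict[str, Doc], terms: List[str]) -> Tuple[int, int]:
    """Return (document frequency, collection frequency) of the phrase."""
    tfs = [phrase_tf(doc, terms) for doc in shard.values()]
    return sum(1 for tf in tfs if tf > 0), sum(tfs)


def near_stats(shard: Dict[str, Doc], a: str, b: str, window: int) -> Tuple[int, int]:
    """Return (document frequency, collection frequency) of the near-pair."""
    tfs = [near_tf(doc, a, b, window) for doc in shard.values()]
    return sum(1 for tf in tfs if tf > 0), sum(tfs)


# Made-up example data, for illustration only.
shard = {
    "d1": {"new": [0, 5], "york": [1, 6], "city": [2]},
    "d2": {"new": [3], "york": [7]},
    "d3": {"york": [0], "new": [1]},
}

phrase_df, phrase_cf = phrase_stats(shard, ["new", "york"])
near_df, near_cf = near_stats(shard, "new", "york", 8)
```

The per-document counts are the part I can already compute from positions; the sums over all documents in `phrase_stats` and `near_stats` are what I do not know how to obtain inside a script.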

I guess one solution is to extend a Lucene Similarity class. However, can
I do this using only a dynamic script?
