Document frequency of phrases

I am implementing the sequential dependency model (SDM) in Elasticsearch
using dynamic scripting, and I need the collection frequency of a phrase
(in ES this would be the frequency within a shard). I can already get the
term frequency of a phrase within a document by comparing the positions of
each of its terms, but how can I get the collection frequency or document
frequency of a phrase? And if that is possible, how can I get the
collection frequency of two terms that occur near each other within a
given slop?
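
To make the statistics concrete, here is a minimal plain-Python sketch of what I am trying to compute. It is not Elasticsearch or Lucene code: the toy position lists, the function names, and the window convention (two positions count as near when their distance is smaller than the window size) are all assumptions made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for positional postings: term -> token positions in one document.
Doc = Dict[str, List[int]]


def phrase_tf(doc: Doc, terms: List[str]) -> int:
    """Count exact, in-order, adjacent occurrences of the phrase in one document."""
    if not terms:
        return 0
    later = [set(doc.get(t, [])) for t in terms[1:]]
    count = 0
    for start in doc.get(terms[0], []):
        # The i-th following term must sit exactly i + 1 positions after the start.
        if all(start + i + 1 in positions for i, positions in enumerate(later)):
            count += 1
    return count


def unordered_window_tf(doc: Doc, a: str, b: str, window: int) -> int:
    """Count (a, b) position pairs, in any order, closer than `window` tokens.

    This pair-counting convention is an assumption; other definitions exist.
    """
    positions_b = doc.get(b, [])
    return sum(
        1
        for pa in doc.get(a, [])
        for pb in positions_b
        if pa != pb and abs(pa - pb) < window
    )


def df_and_cf(docs: List[Doc], tf_fn: Callable[[Doc], int]) -> Tuple[int, int]:
    """Document frequency = docs with tf > 0; collection frequency = sum of tf."""
    tfs = [tf_fn(d) for d in docs]
    return sum(1 for tf in tfs if tf > 0), sum(tfs)


# Toy collection (made-up data).
docs: List[Doc] = [
    {"new": [0, 5], "york": [1, 6], "city": [2]},
    {"york": [0], "new": [1]},
    {"new": [0], "city": [3]},
]

phrase_stats = df_and_cf(docs, lambda d: phrase_tf(d, ["new", "york"]))
window_stats = df_and_cf(docs, lambda d: unordered_window_tf(d, "new", "york", 2))
print(phrase_stats)  # (1, 2)
print(window_stats)  # (2, 3)
```

The per-document counts are the part I can already do from positions; what I cannot see is how to obtain the sums over all documents in a shard from inside a script.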

I guess one solution is to extend a Lucene Similarity class, but can I do
this using only a dynamic script?

