I am trying to figure out the most efficient way to query Elasticsearch without scoring. (I assume that scoring adds overhead and that skipping it makes the query faster.)
I also need to be able to exclude documents that match a condition (must_not).
So, say I want to build a query that returns documents that have the string "some_name" in the companyName field, were created after "2016-07-20", and do not have "foo" in companyName.
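For illustration, a query with those three conditions could be expressed entirely in filter context, roughly as sketched below. This is not the original poster's query. The field name `creationDate` and the helper `build_filter_query` are assumptions made up for the example; only `companyName` comes from the question. Clauses under `filter` and `must_not` in a `bool` query run in filter context, so no relevance score is computed for them.

```python
import json


def build_filter_query(name_part, created_after, excluded_part):
    """Build a bool query body that only uses non-scoring clauses.

    `creationDate` is a hypothetical field name; adjust it to your mapping.
    """
    return {
        "query": {
            "bool": {
                # filter clauses must match but do not contribute to the score
                "filter": [
                    # Lucene regexps are anchored, hence the surrounding .*
                    {"regexp": {"companyName": f".*{name_part}.*"}},
                    {"range": {"creationDate": {"gt": created_after}}},
                ],
                # must_not clauses also run in filter context
                "must_not": [
                    {"regexp": {"companyName": f".*{excluded_part}.*"}},
                ],
            }
        }
    }


query = build_filter_query("some_name", "2016-07-20", "foo")
print(json.dumps(query, indent=2))
```

The resulting dictionary can be sent as the body of a `_search` request with whatever client you use.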
It is worth noting, though, that your regexp filter will not be very efficient. If you want to speed up your query, you'll probably want to index companyName with an ngram token filter and then use a term query on it, which will be much more efficient.
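A rough sketch of what the ngram approach might look like follows. The analyzer and filter names (`company_ngram_analyzer`, `company_ngram_filter`) and the gram sizes are assumptions chosen for the example, and the mapping layout assumes a typeless (Elasticsearch 7+) index, where `index.max_ngram_diff` has to be raised when `max_gram - min_gram` exceeds 1. The small `ngrams` helper only imitates what the token filter would emit, to show why an exact term lookup can replace the regexp.

```python
# Index settings and mapping: companyName is indexed as lowercase ngrams.
index_body = {
    "settings": {
        "index": {"max_ngram_diff": 7},  # allows max_gram - min_gram = 7
        "analysis": {
            "filter": {
                "company_ngram_filter": {
                    "type": "ngram",
                    "min_gram": 3,
                    "max_gram": 10,
                }
            },
            "analyzer": {
                "company_ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",  # keep the whole value as one token
                    "filter": ["lowercase", "company_ngram_filter"],
                }
            },
        },
    },
    "mappings": {
        "properties": {
            "companyName": {"type": "text", "analyzer": "company_ngram_analyzer"}
        }
    },
}

# A term query is not analyzed, so it looks up the exact gram in the index.
term_query = {
    "query": {"bool": {"filter": [{"term": {"companyName": "some_name"}}]}}
}


def ngrams(text, min_gram, max_gram):
    """Imitate the lowercase + ngram filter chain on a single token."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


indexed_terms = ngrams("My_Some_Name_Inc", 3, 10)
print("some_name" in indexed_terms)
```

Note that the searched string has to be no longer than `max_gram` (and no shorter than `min_gram`) for the term lookup to find it, and that ngrams increase index size.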
If you're concerned about the cost of scoring and sorting, you can tell Elasticsearch to return documents in index order by sorting on _doc, which is the most efficient sort order. Note that you will only see a real impact if you're matching a large number of documents.
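As a minimal sketch, sorting on `_doc` is just an extra `sort` entry in the search body. The `term` filter here is only a placeholder condition for the example.

```python
search_body = {
    "query": {
        "bool": {
            # placeholder filter; any non-scoring clauses can go here
            "filter": [{"term": {"companyName": "some_name"}}]
        }
    },
    # return hits in index order instead of ordering by score
    "sort": ["_doc"],
}

print(search_body["sort"])
```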
As Val mentioned, your main performance hit comes from the regexp query. Besides replacing it with ngrams, your companyName field could be tokenized on some boundary rule, or even with a regex-based tokenizer. That would be more efficient than a regexp query and, in some cases, better than the ngram approach, since it generates fewer terms per document.
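One way to sketch the boundary-based alternative is with a `pattern` tokenizer that splits on non-word characters and underscores. The names `company_boundary_tokenizer` and `company_boundary_analyzer` and the split pattern itself are assumptions for the example; the right boundary rule depends on what your company names look like. The `boundary_tokens` helper approximates the analyzer in plain Python to show how few terms it produces.

```python
import re

# Analysis settings: split companyName on runs of non-word chars or underscores.
analysis_settings = {
    "analysis": {
        "tokenizer": {
            "company_boundary_tokenizer": {
                "type": "pattern",
                "pattern": "[\\W_]+",  # the pattern matches the separators
            }
        },
        "analyzer": {
            "company_boundary_analyzer": {
                "type": "custom",
                "tokenizer": "company_boundary_tokenizer",
                "filter": ["lowercase"],
            }
        },
    }
}


def boundary_tokens(text):
    """Approximate the analyzer above: split on boundaries, then lowercase."""
    return [tok.lower() for tok in re.split(r"[\W_]+", text) if tok]


tokens = boundary_tokens("My_Some_Name Inc.")
print(tokens)
```

With this kind of analysis a document yields only a handful of terms, but matching then works on whole tokens (for example `some` or `name`) rather than on arbitrary substrings.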