I'm trying to use a script filter with a filtered query, as given below. Query execution takes around 20 to 30 seconds for 70K matching facet results. How can I speed it up?
I'd try to eliminate that "_source._boost" part from your script and see if
that makes any difference. If it does, store your doc boost in a field and
access it through doc["boostfield"].value instead of _source.
When you access _source.X in a script, the _source field is loaded per doc,
parsed, and then provided to the script. Depending on how many documents
you are hitting, it can be slow if there are a lot.
The document _boost should already be factored into the score, but if you
need to extract a numeric boost value and make it part of your script, try
using doc["boostfield"].value, instead of _source.boostfield. That way, the
values are loaded into memory and would perform better. Just be aware that
it will take up some memory.
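
As a rough illustration of the difference, here is a minimal Python sketch that builds the request body for a filtered query with a script filter, once with the slow _source access and once with the doc[...] access. The field name boostfield comes from the reply above; the match_all inner query, the min_boost parameter and its value are assumptions, since the original query is not shown in this thread.

```python
import json


def script_filter_query(script: str, params: dict) -> dict:
    """Wrap a script filter in a filtered query (pre-2.0 query DSL)."""
    return {
        "query": {
            "filtered": {
                # Placeholder inner query; the real query is not shown in the thread.
                "query": {"match_all": {}},
                "filter": {"script": {"script": script, "params": params}},
            }
        }
    }


# Slow variant: _source is loaded and parsed for every document the script sees.
slow = script_filter_query("_source.boostfield > min_boost", {"min_boost": 1.5})

# Faster variant: the value is read from field data held in memory.
fast = script_filter_query("doc['boostfield'].value > min_boost", {"min_boost": 1.5})

print(json.dumps(fast, indent=2))
```

Only the script string changes between the two bodies, so the two variants can be timed against each other without touching the rest of the request.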
Vinoth, I just got word that doc _boost will be deprecated in ES 1.0. The
recommendation is to start using the function_score query instead: store
your doc "boost" as a field and extract it with doc["boost"].value going
forward.
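
A function_score request along those lines might look like the following Python sketch. It assumes a numeric field named boost (as in the reply above) and uses a match_all inner query as a placeholder; the explicit boost_mode of "multiply" is also an assumption, chosen so the final score is the query score times the stored boost.

```python
import json


def function_score_body(boost_field: str = "boost") -> dict:
    """Build a function_score query that scales _score by a stored numeric field."""
    return {
        "query": {
            "function_score": {
                # Placeholder inner query; substitute the real one.
                "query": {"match_all": {}},
                # The script only returns the stored boost value ...
                "script_score": {"script": "doc['%s'].value" % boost_field},
                # ... and boost_mode multiplies it with the query score.
                "boost_mode": "multiply",
            }
        }
    }


body = function_score_body()
print(json.dumps(body, indent=2))
```

Keeping the script down to a single doc[...] lookup avoids the per-document _source parsing discussed earlier.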
Thanks, we will look at using the function_score query at a later point.
Switching to function_score with this new boost field will require us to re-process and re-index all documents.
We will migrate to the function_score query when we next plan to re-index (at that time we will have the new numeric field representing the boost).
Does _score currently include the _boost factor? Is there any other way to include the _boost value without adding overhead in the script function?
Judging by the commits, the boost functionality was not removed, only
deprecated. That said, you really should move to query time boosting.
Index-time boosts are encoded inside the field norms. You should see a
difference inside each field norm. If you have omitted norms, then you will
not have any boosts on that field. The field norm is also lossy since it
uses only 1 byte. These are some of the many reasons to switch away from
index-time document boosts.
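
To give a feel for how lossy a 1-byte norm is, here is a small Python sketch modeled on Lucene's SmallFloat.floatToByte315 encoding, which I believe is what the default similarity used for norms in Lucene 4.x. Treat the constants as an assumption rather than a reference implementation. Also note that the real norm combines the boost with a length normalization factor, so this only shows the precision loss on a bare value.

```python
import struct


def float_to_byte315(f: float) -> int:
    """Encode a float into one byte (assumed SmallFloat 3/15 layout)."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # signed 32-bit pattern
    small = bits >> 21
    if small <= (63 - 15) << 3:
        return 0 if bits <= 0 else 1  # underflow: zero or smallest positive
    if small >= ((63 - 15) << 3) + 0x100:
        return 255  # overflow: clamp to the largest byte value
    return small - ((63 - 15) << 3)


def byte315_to_float(b: int) -> float:
    """Decode a byte produced by float_to_byte315 back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def roundtrip(f: float) -> float:
    return byte315_to_float(float_to_byte315(f))


for boost in (1.0, 1.1, 1.3, 2.2):
    print(boost, "->", roundtrip(boost))
```

Under this encoding, boosts of 1.0 and 1.1 come back as the same value, so small index-time boost differences can vanish entirely.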
Cheers,
Ivan
On Thu, Jan 30, 2014 at 2:25 PM, Binh Ly binh@hibalo.com wrote: