I understand the underlying Lucene dependency. The link I mentioned (http://nlp.uned.es/~jperezi/Lucene-BM25/) refers to an implementation of BM25(F) on top of Lucene. It provides a number of Lucene extensions for Scorer, Query, Weight, and Similarity.
I think my question is better stated as: supposing one had Lucene extensions that implemented BM25(F), how would they be passed through to Elasticsearch?
It seems the main elements are already there in the API (the query DSL) in terms of field-level boosting, so we could have a weighted sum of field-level rankings. But there would have to be a way to load the Lucene extensions.
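To make clear what the extensions would have to compute: as I understand BM25F, it is not quite a weighted sum of per-field scores. The boosted, length-normalized term frequencies are combined across fields first, and saturation is applied once to the combined value. Below is a rough sketch in plain Java, with no Lucene or Elasticsearch classes involved. The class and method names, the example fields (title, body), and all parameter values are my own assumptions for illustration, and the IDF shown is one common non-negative variant, not necessarily the one used by the linked implementation.

```java
public final class Bm25fSketch {

    // Combine term frequencies across fields BEFORE saturation.
    // For each field f: boost[f] * tf[f] / ((1 - b[f]) + b[f] * len[f] / avgLen[f])
    public static double weightedTf(double[] tf, double[] len, double[] avgLen,
                                    double[] boost, double[] b) {
        double sum = 0.0;
        for (int f = 0; f < tf.length; f++) {
            double norm = (1.0 - b[f]) + b[f] * (len[f] / avgLen[f]);
            sum += boost[f] * tf[f] / norm;
        }
        return sum;
    }

    // One common non-negative IDF variant (an assumption, see lead-in).
    public static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Saturate the combined frequency once, then scale by IDF.
    public static double termScore(double weightedTf, double idf, double k1) {
        return idf * weightedTf / (k1 + weightedTf);
    }

    public static void main(String[] args) {
        // Hypothetical two-field document: index 0 = title, index 1 = body.
        double[] tf     = {1, 3};       // occurrences of the term per field
        double[] len    = {5, 200};     // field lengths in this document
        double[] avgLen = {5, 100};     // average field lengths in the index
        double[] boost  = {2.0, 1.0};   // per-field weights
        double[] b      = {0.75, 0.75}; // per-field length normalization

        double wtf = weightedTf(tf, len, avgLen, boost, b);
        double score = termScore(wtf, idf(1000, 10), 1.2);
        System.out.println("weighted tf = " + wtf + ", score = " + score);
    }
}
```

A full query score would sum `termScore` over the query terms. The point is that the per-field boosts enter inside the saturation function, which is why field-level boosting in the query DSL alone would not reproduce it.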
If anyone can shed light on the path to pursue this, or if it has already been done, I would be much obliged.
Just to answer the question of how this can be integrated into ES (buggy or not), and based only on my scanning through the docs:
A custom similarity can easily be plugged into Elasticsearch. (This is not documented, but when you get to it, I can point you to how to configure it.)
Custom queries can also be added to the query DSL. They need to "know" how to be parsed, and then you just use them. Check any of the query implementations in Elasticsearch to see how it's done; you can then write a plugin that adds your own query parser to the IndexQueryParserModule.
This is very high level. If you decide to do it, we can delve into the details.
On Tuesday, May 3, 2011 at 9:49 AM, Alberto Paro wrote:
If you read the two JIRA issues, you'll discover that the BM25F implementation you point to is buggy.
You should read the Lucene flexscore branch (GSoC), which is based on Lucene trunk.
I started updating ES to the Lucene flexscore trunk, but I stopped for lack of time.
The index format in Lucene trunk changes often, so it's not safe for production.