Question about the future release of ES that incorporates Lucene 7.0

Hi,
I was just reading Lucene's blog post on changes in Lucene 7.0. Since it's on the roadmap for ES releases to include that, I was wondering what the impact is of moving away from TF/IDF to BM25 for scoring. For applications that use the current TF/IDF-based scoring, how will it change document rankings? Specifically, I am thinking of regression tests that depend on the ordering of the docs in the results.

Ramdev

BM25 is the default in Elasticsearch 5.0+

In general you should see ranking improvements. BM25 is just a better way
of doing TF*IDF. That said, I think for some specialized use cases classic
Lucene TF*IDF can be easier to reason about.
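
To make the difference concrete, here is a rough Python sketch of the two per-term scoring shapes. It is a simplification, not Lucene's actual code: it ignores query normalization, coord, boosts, and the lossy encoding of length norms, and the function names, corpus statistics, and document values are all made up for illustration. It does show that BM25 saturates term frequency and normalizes length relative to the average document length, which can be enough to reorder two documents.

```python
import math


def classic_score(freq, doc_len, doc_count, doc_freq):
    """Simplified classic TF*IDF: sqrt(tf) * idf^2 * 1/sqrt(length)."""
    tf = math.sqrt(freq)
    idf = 1.0 + math.log((doc_count + 1) / (doc_freq + 1))
    norm = 1.0 / math.sqrt(doc_len)
    return tf * idf * idf * norm


def bm25_score(freq, doc_len, avg_doc_len, doc_count, doc_freq, k1=1.2, b=0.75):
    """Simplified BM25 with the commonly used defaults k1=1.2, b=0.75."""
    idf = math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length is normalized relative to the average document length.
    length_norm = 1.0 - b + b * doc_len / avg_doc_len
    # The term-frequency part saturates: it can never exceed k1 + 1.
    return idf * freq * (k1 + 1.0) / (freq + k1 * length_norm)


# Hypothetical corpus statistics and two hypothetical documents,
# each given as (term frequency, document length in terms).
DOC_COUNT, DOC_FREQ, AVG_DOC_LEN = 1000, 10, 100.0
docs = {"short": (1, 4), "long": (5, 25)}

classic_order = sorted(
    docs,
    key=lambda d: classic_score(*docs[d], DOC_COUNT, DOC_FREQ),
    reverse=True,
)
bm25_order = sorted(
    docs,
    key=lambda d: bm25_score(*docs[d], AVG_DOC_LEN, DOC_COUNT, DOC_FREQ),
    reverse=True,
)

print(classic_order)  # ['short', 'long']
print(bm25_order)     # ['long', 'short']
```

With these made-up numbers the classic formula puts the very short document first, while BM25 prefers the document with more matches, so a test that asserts an exact score-based order could plausibly flip between the two.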

I wrote quite a bit about BM25 vs TF*IDF in Lucene-based search here

Doug

Thanks Doug. I think you answered my question (BM25 being just a different way of doing TF*IDF). That said, would it actually cause documents to be ranked in a different order? (I am guessing yes.) If so, should regression tests that depend on score-based ordering be changed?

Cheers

Ramdev
