ElasticSearch 5.x, Inconsistencies when sorting by relevance

jovinbm · March 21, 2017, 2:58pm

I just upgraded from Elasticsearch 2.4 to 5.2.2, and some of my tests that sort by relevance are now failing. I understand that 5.x implements a new engine for searching, but considering how simple my query is, I was expecting to get the same results as in 2.4.

Query:

{ query:
   { bool:
      { must: [ { multi_match: { fields: [ 'heading' ], query: 'Quick Brown Fox' } } ],
        must_not: [],
        should: [],
        filter: { bool: { must: [], must_not: [], should: [] } } } },
  sort: [ { _score: 'desc' } ] }

Result:

[ { _index: 'axp_1',
    _type: 'database_article',
    _id: '2',
    _score: 3.9133706,
    _source: { heading: 'Quick Fox', id: 2 } },
  { _index: 'axp_1',
    _type: 'database_article',
    _id: '3',
    _score: 2.5700963,
    _source: { heading: 'Quick Brown Fox', id: 3 } },
  { _index: 'axp_1',
    _type: 'database_article',
    _id: '1',
    _score: 1.8001146,
    _source: { heading: 'Quick', id: 1 } } ]

In short, for the query "Quick Brown Fox", the document with the heading "Quick Fox" (score 3.91) is ranked higher than "Quick Brown Fox" (score 2.57), which contains all of the query words. I might be misunderstanding something about how scoring works, but this was working in 2.x.

What am I doing wrong? Thanks

Clinton_Gormley · March 24, 2017, 8:24am

First, are you running these tests on an index with a single shard? If not, and the index holds only a few documents, the term statistics on each shard may be skewed, because by default each shard scores its documents using its own local statistics.
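To make the skew concrete, here is a minimal sketch of how the same term can get a different inverse document frequency on two shards. It assumes the BM25 IDF formula that Lucene uses for the default similarity in 5.x. The placement of documents on `shard_a` and `shard_b`, and the extra filler headings, are hypothetical and are not taken from the index in this thread.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """BM25 inverse document frequency for one term on one shard."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def shard_idf(shard_docs, term):
    """Compute the IDF of `term` using only the documents held by one shard."""
    doc_freq = sum(1 for heading in shard_docs if term in heading.lower().split())
    return bm25_idf(len(shard_docs), doc_freq)


# Hypothetical placement: the filler headings only exist to vary the statistics.
shard_a = ["Quick Fox", "Lazy Dog", "Sleepy Cat", "Red Hen"]
shard_b = ["Quick Brown Fox", "Quick"]

idf_a = shard_idf(shard_a, "quick")            # "quick" is rare on this shard
idf_b = shard_idf(shard_b, "quick")            # "quick" is in every document here
idf_all = shard_idf(shard_a + shard_b, "quick")  # a single shard sees everything

print(f"shard_a: {idf_a:.3f}  shard_b: {idf_b:.3f}  single shard: {idf_all:.3f}")
```

Under these assumptions, "quick" counts for much more on `shard_a` than on `shard_b`, so a partial match on one shard can outscore a full match on another. With a single shard, every document is scored against the same statistics.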

I suggest turning on the explain option and stepping through the (verbose) output to figure out where the difference is coming from.
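As a rough sketch, the search body from the original post could be rebuilt with `explain` enabled as shown here. The helper name `build_explain_search` is hypothetical, and the block only builds and prints the JSON body; sending it to a cluster is left to whichever client you use.

```python
import json


def build_explain_search(query_text, fields):
    """Return a search body like the one in the thread, with explain enabled."""
    return {
        # Ask Elasticsearch to attach a scoring breakdown to every hit.
        "explain": True,
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"fields": fields, "query": query_text}}
                ]
            }
        },
        "sort": [{"_score": "desc"}],
    }


body = build_explain_search("Quick Brown Fox", ["heading"])
print(json.dumps(body, indent=2))
```

With `explain` set, each hit in the response should carry an `_explanation` tree, so you can compare the per-term weights of two hits side by side.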

jovinbm · March 25, 2017, 10:52pm

I was running the tests with the default number of shards (5). Changing the number of shards to 1 for testing fixed the issue. I learnt something new today. Thanks for the help.
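For reference, a minimal sketch of the index creation body for a single-shard test index might look like the following. The helper name `build_single_shard_index_body` is hypothetical; the body would be sent when creating the test index (for example `axp_1`, the index name shown in the results above), and any mappings you need would go alongside the settings.

```python
import json


def build_single_shard_index_body():
    """Return index creation settings that keep all documents on one shard."""
    return {"settings": {"index": {"number_of_shards": 1}}}


index_body = build_single_shard_index_body()
print(json.dumps(index_body, indent=2))
```

This is a convenience for small test data sets; it is not a recommendation for how to size a production index.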

For anyone else looking for a further explanation of why this happens, read this article by Elastic: https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-is-broken.html

