Items that should have equal score don't, could someone help me understand?

DougFolksy · November 14, 2016, 1:57pm

Hi -- I have a query that searches for items in a certain subcategory that should (not must) have been tagged with a certain tag. The items also have a listed_at field. If the query finds no items that have the desired tags, then it should just return the most recently listed items in the subcategory.
Oh, and I'm using the msearch method to do a multi-search in the Ruby ElasticModel API.
The query looks like this: https://gist.github.com/biot023/11f12b82fb7df3dee816434f35b8021f
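
The gist itself is not reproduced in this thread. Purely as a sketch, a query of the shape described above could be built in Ruby roughly as follows. The helper name, the subcategory id 42, and the use of a "must" term clause for the subcategory are assumptions for illustration, not the poster's actual query.

```ruby
require "json"

# Hypothetical helper: search body for "items in one subcategory, preferably
# tagged with `tag`, newest first among items with equal score".
def subcategory_search(subcategory_id, tag)
  {
    query: {
      bool: {
        # Required: the item belongs to the subcategory.
        must:   [{ term: { subcategory_id: subcategory_id } }],
        # Optional: a matching tag raises the score but is not required.
        should: [{ term: { tags: tag } }]
      }
    },
    # Sort by score first, then fall back to the listing date.
    sort: [
      { _score:    { order: "desc" } },
      { listed_at: { order: "desc" } }
    ]
  }
end

puts JSON.pretty_generate(subcategory_search(42, "snow leopard"))
```

With a body like this, listed_at only decides the order between items whose scores are exactly equal, which is why unequal scores hide the date ordering.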

I would expect this to return the most recently listed items (according to the listed_at field), because I would expect all items to have the same score: they should all have the correct subcategory ID and none of them should have the tag in the query ("snow leopard").
However, the items seem to come back in a random order, each having been assigned a seemingly random score. An example of what actually gets returned is this: https://gist.github.com/biot023/9a959631de8527ec9314e2f7ce34e2c7

(Apologies for the massive code dump, I'm not sure what would be relevant or not.)
As you can see, the scoring seems more-or-less random, which means that the listed_at field never gets to act as the deciding sort field, as desired.

For completeness, the mappings of the items are as follows:

mappings do
  indexes( :title, type: "string", analyzer: "snowball" )
  indexes( :description, type: "string", analyzer: "snowball" )
  indexes( :tags, type: "string", index: "not_analyzed" )
  indexes( :section_id, type: "integer" )
  indexes( :category_id, type: "integer" )
  indexes( :subcategory_id, type: "integer" )
  indexes( :section, type: "string", index: "not_analyzed" )
  indexes( :category, type: "string", index: "not_analyzed" )
  indexes( :subcategory, type: "string", index: "not_analyzed" )
  indexes( :section_name, type: "string", analyzer: "snowball" )
  indexes( :category_name, type: "string", analyzer: "snowball" )
  indexes( :subcategory_name, type: "string", analyzer: "snowball" )
  indexes( :listed_at, type: "string", index: "not_analyzed" )
  indexes( :price, type: "float" )
  indexes( :price_range, type: "string", index: "not_analyzed" )
  indexes( :price_range_desc, type: "string", index: "not_analyzed" )
  indexes( :colour_ids, type: "integer" )
  indexes( :material_ids, type: "integer" )
  indexes( :created_at, type: "string", index: "not_analyzed" )
end

Could anyone explain how I'm ending up with random scoring, and maybe explain how I can fix it so that the score stays consistent?
Thank you for any and all help offered,
Doug.

danielmitterdorfer · November 14, 2016, 2:20pm

Hi @DougFolksy,

one thing that jumps out is that you map listed_at as string. Your documents all have the same score (which is ok) but then Elasticsearch sorts lexicographically by date (i.e. it treats the date as string) and I guess that's what you find so puzzling. So all you need to do is to map listed_at as date (and reindex the documents as the mapping does not apply to documents that already exist in the index).
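
To illustrate the difference between the two orderings, here is a small Ruby sketch. The thread does not show what the original listed_at strings looked like, so the day-first format below is a made-up assumption. Fixed-width ISO 8601 timestamps in the same time zone happen to sort the same way as strings and as dates, so the effect depends on the format in use.

```ruby
require "date"

# Hypothetical date strings; the real format used in the index is not shown in the thread.
listed_at = ["2 Nov 2016", "13 Nov 2016", "9 Oct 2016"]

# String comparison goes character by character, so "13..." sorts before "2...".
as_strings = listed_at.sort

# Parsing first compares actual calendar dates.
as_dates = listed_at.sort_by { |s| Date.parse(s) }

puts as_strings.inspect  # ["13 Nov 2016", "2 Nov 2016", "9 Oct 2016"]
puts as_dates.inspect    # ["9 Oct 2016", "2 Nov 2016", "13 Nov 2016"]
```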

Daniel

DougFolksy · November 14, 2016, 3:02pm

Thanks for that @danielmitterdorfer
Unfortunately, I get the same issue once I've changed the mapping to date and re-run my tests.
For completeness, the mappings now look like this:

mappings do
  indexes( :title, type: "string", analyzer: "snowball" )
  indexes( :description, type: "string", analyzer: "snowball" )
  indexes( :tags, type: "string", index: "not_analyzed" )
  indexes( :section_id, type: "integer" )
  indexes( :category_id, type: "integer" )
  indexes( :subcategory_id, type: "integer" )
  indexes( :section, type: "string", index: "not_analyzed" )
  indexes( :category, type: "string", index: "not_analyzed" )
  indexes( :subcategory, type: "string", index: "not_analyzed" )
  indexes( :section_name, type: "string", analyzer: "snowball" )
  indexes( :category_name, type: "string", analyzer: "snowball" )
  indexes( :subcategory_name, type: "string", analyzer: "snowball" )
  indexes( :listed_at, type: "date", index: "not_analyzed" )
  indexes( :price, type: "float" )
  indexes( :price_range, type: "string", index: "not_analyzed" )
  indexes( :price_range_desc, type: "string", index: "not_analyzed" )
  indexes( :colour_ids, type: "integer" )
  indexes( :material_ids, type: "integer" )
  indexes( :created_at, type: "date", index: "not_analyzed" )
  indexes( :expires_at, type: "date", index: "not_analyzed" )
end

I tried it without the index: "not_analyzed" bit of the date mappings, too, but that didn't seem to make a difference.

danielmitterdorfer · November 14, 2016, 3:03pm

Hi @DougFolksy,

did you recreate the index from scratch? Did you check the mapping and the documents directly in Elasticsearch?

Daniel

DougFolksy · November 14, 2016, 3:06pm

Yes, @danielmitterdorfer, I deleted the index and re-imported.
When I check my mappings, now, I see this for the listed_at field:

"listed_at" : {
  "type" : "date",
  "format" : "strict_date_optional_time||epoch_millis"
},

DougFolksy · November 14, 2016, 3:07pm

Oh, and the results show items with listed_at dates like this:

"listed_at": "2016-11-13T23:07:03.000Z"

Which is different from before, so I'm guessing that that is correct.

DougFolksy · November 14, 2016, 3:12pm

Ah!
I think I might know what the cause is!
The items that get lower scores than the others now seem to be ones with more tags.
Could it be that they get a lower score because they have two tags that don't match the tag in the query? Whereas the other items that only have one tag that doesn't match get a higher score?
In effect, the items are getting penalised for every tag they have which doesn't match the one in the query.
Does that seem likely?

DougFolksy · November 14, 2016, 3:17pm

Ah. No.
That's not it.
They're shifting around randomly, again.
Sorry -- got a bit over-excited, there!

DougFolksy · November 14, 2016, 3:29pm

Just to confirm (sorry, I'm spamming aren't I?), the different items are getting different scores each time I run my tests -- here's an example from two consecutive runs:

Item    -- Score 1    -- Score 2
-----------------------------------
Item 01 -- 0.09848769 -- 0.04500804
Item 02 -- 0.28986934 -- 0.09848769
Item 03 -- 0.09848769 -- 0.09848769
Item 04 -- 0.28986934 -- 0.09848769
Item 05 -- 0.04500804 -- 0.04500804
Item 06 -- 0.04500804 -- 0.09848769

For each test run, the things that will differ (because it's a different time, or they've been generated differently) will be:

listed_at and other date fields ending in _at
_id and id, as part of new test data being generated for the test
user_id
shop_id
section_id
category_id
subcategory_id
primary_image (to some random UUID)
section, category, and subcategory, as these are fields composited from their IDs, slugs and names

But the only fields referenced in the query are listed_at and tags (an array of strings).
So I can't help thinking that it must still be something to do with the listed_at field, as that's the only relevant thing that changes from test to test.

danielmitterdorfer · November 14, 2016, 3:32pm

Hi @DougFolksy,

no worries.

You can use the explain API to find out how Elasticsearch calculated the score of a document.

Can you please also provide the search results as a gist? I'd like to see whether the results are now sorted by _score and then by date (but this time interpreted as date, not as string).

Regarding the random shuffling, can you please also provide the output of GET /your_index_name_here/_stats? (where you replace "your_index_name_here" with the name of your index).
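
As a rough sketch of what those two requests could look like from Ruby, the helpers below only build the HTTP requests and do not send them. The index name "items", the type "item", and the document id "1" are placeholders, and the explain path follows the Elasticsearch 2.x style of /index/type/id/_explain, which is assumed here because the mappings use the "string" type.

```ruby
require "json"
require "net/http"

# Builds (but does not send) an explain request for a single document.
def explain_request(index, type, id, query)
  req = Net::HTTP::Get.new("/#{index}/#{type}/#{id}/_explain")
  req["Content-Type"] = "application/json"
  req.body = JSON.generate({ query: query })
  req
end

# Builds (but does not send) an index stats request.
def stats_request(index)
  Net::HTTP::Get.new("/#{index}/_stats")
end

req = explain_request("items", "item", "1", { term: { tags: "snow leopard" } })
puts "#{req.method} #{req.path}"
puts req.body
puts "GET #{stats_request('items').path}"
```

Against a running node, either request could then be sent with something like Net::HTTP.start("localhost", 9200) { |http| http.request(req) }.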

Daniel

danielmitterdorfer · November 14, 2016, 3:37pm

Hi @DougFolksy,

you write:

the things that will differ (because it's a different time, or they've been generated differently) will be:

listed_at and other date fields ending in _at

[...]

the only fields referenced in the query are listed_at [...]

Isn't it expected that you get different scores each time if you have different values for listed_at each time?

Daniel

DougFolksy · November 14, 2016, 3:38pm

Sure thing, @danielmitterdorfer

The latest results are here: https://gist.github.com/biot023/a10894b54c825d8cde4fc633082813b6

And the output from the stats query is here: https://gist.github.com/biot023/3d3f4962afb4495de0180026d4126753

I'm having a look at the explain API, now, and will report back -- thanks again, man!

DougFolksy · November 14, 2016, 3:41pm

"Isn't it expected that you get different scores each time if you have different values for listed_at each time?"

No, sorry, I should've explained -- the values change each time, but they're all consistently relative to each other.
"Item 01" gets a listed_at time of 20 hours ago, "Item 02" gets one of 19 hours ago, and so on.

danielmitterdorfer · November 14, 2016, 3:46pm

Hi @DougFolksy,

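ok, I checked the results. You have a very small number of documents but multiple shards, and this can cause (pseudo-)issues with scoring: each shard computes term statistics such as IDF from its own documents only, so with very few documents per shard the same match can score differently depending on which shard the document landed on (see the section "Relevance is broken" in the Definitive Guide). The gist of this: can you please create the index for your tests with only one primary shard, i.e.: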

PUT /your_index_name
{ 
    "settings": { 
        "number_of_shards": 1 
    }
}
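
To see why shard count matters with so few documents, here is a small Ruby simulation. It is a sketch under assumptions: it uses the classic Lucene TF-IDF formula idf = 1 + ln(numDocs / (docFreq + 1)), assumes every document matches the query term, and uses a made-up assignment of documents to shards. Elasticsearch routes documents by hashing their _id, so freshly generated ids in each test run can produce a different distribution and therefore different scores.

```ruby
# Classic Lucene TF-IDF inverse document frequency, from one shard's own counts.
def idf(num_docs, doc_freq)
  1.0 + Math.log(num_docs.to_f / (doc_freq + 1))
end

# `assignment` maps a document name to the shard it landed on (hypothetical data).
# Every document is assumed to match the term, so on each shard doc_freq equals
# the number of documents held by that shard.
def idf_per_document(assignment)
  shard_sizes = assignment.values.each_with_object(Hash.new(0)) { |shard, sizes| sizes[shard] += 1 }
  assignment.transform_values do |shard|
    idf(shard_sizes[shard], shard_sizes[shard]).round(4)
  end
end

spread_out = {
  "Item 01" => 0, "Item 02" => 0,
  "Item 03" => 1,
  "Item 04" => 2, "Item 05" => 2, "Item 06" => 2
}
one_shard = spread_out.transform_values { 0 }

p idf_per_document(spread_out)  # documents on differently sized shards get different values
p idf_per_document(one_shard)   # every document gets the same value
```

With a single primary shard all documents share the same statistics, so identical matches receive identical scores and the secondary sort field can decide the order.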

Daniel

DougFolksy · November 14, 2016, 4:04pm

When I get around to writing the epic opera based upon this thread, your character's arias are going to be so heroic!
It took me a while to figure out how to do this with our test setup, but once I had, I've not had an inconsistent result yet!
Thank you so much, @danielmitterdorfer!
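
The thread does not say how the test setup was changed. One possible shape, sketched in Ruby, is to build the index-creation body with a single primary shard and pass it to whatever creates the test index. The helper name and the index name "items_test" are hypothetical.

```ruby
# Hypothetical helper: request body for creating a test index with one primary shard.
# Mappings are only included when some are given.
def single_shard_index_body(mappings = nil)
  body = { settings: { index: { number_of_shards: 1 } } }
  body[:mappings] = mappings if mappings
  body
end

body = single_shard_index_body
p body

# With the elasticsearch-ruby client this could be sent as follows
# (assumes a `client` connected to a running cluster):
#   client.indices.create(index: "items_test", body: body)
```

When the index is defined through the model class instead, the settings method of elasticsearch-model accepts a similar hash, for example settings index: { number_of_shards: 1 } wrapped around the existing mappings block.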

danielmitterdorfer · November 14, 2016, 4:05pm

Haha, glad I could help you @DougFolksy and I'm looking forward to your opera. In the meantime, enjoy your now consistent search results.

Daniel

system · December 12, 2016, 4:05pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
