Question on results I'm getting

RurouniKakita · July 6, 2022, 9:11pm

I'm utilizing App Search to implement a search that includes searching ICD 10 codes.

One of our users has reported some slightly strange results. When they send the query S33 to the App Search engine, most of the results look correct, but we are seeing some outliers. Is there documentation I can review, or can someone explain the reason for this set of results?

"id": "S33", "score": 11.731796
"id": "S33.1", "score": 5.699594
"id": "S06.33", "score": 5.699594 - This is the odd result, mainly because its score is equal to that of the result above
"id": "S33.5", "score": 5.6950865
"id": "S38", "score": 3.7110178

Sean_Story · July 6, 2022, 10:17pm

Hi @RurouniKakita ,

What do you have your Precision Tuning slider set to?

My guess is that the default tokenization is struggling to chop up those IDs into tokens the way you're expecting, and so typo-tolerance is taking over, meaning that where in the ID the difference exists matters less than the number of differences. I am surprised that the score is the exact same, though that may be influenced by other fields, if there's more than just the "id" field in your document set.

My suggestion would be to add an extra field to your dataset like "id_prefix" that only contains the id.split('.')[0] (everything before the period, if there is a period), and use weights and/or boosts to weight that field higher than the "id" field. This would help your result ordering to be more like:

"id": "S33", "id_prefix": "S33"
"id": "S33.1", "id_prefix": "S33"
"id": "S33.5","id_prefix": "S33"
"id": "S06.33", "id_prefix": "S06"
"id": "S38", "id_prefix": "S38"

Where the bottom two would have much lower scores than the top 3.
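
As a rough sketch of the suggestion above, the prefix could be derived before indexing and then weighted at query time. The helper names (`id_prefix`, `add_prefix`) and the weight values are illustrative assumptions, and the request body assumes the App Search `search_fields` option with per-field `weight`; check the search API reference for your version before relying on it.

```python
def id_prefix(code: str) -> str:
    """Return everything before the first period, or the whole code if there is none."""
    return code.split(".")[0]


def add_prefix(doc: dict) -> dict:
    """Return a copy of the document with a hypothetical 'id_prefix' field added."""
    return {**doc, "id_prefix": id_prefix(doc["id"])}


# The five ids from the result set in question.
docs = [{"id": code} for code in ["S33", "S33.1", "S06.33", "S33.5", "S38"]]
enriched = [add_prefix(doc) for doc in docs]

# Assumed search request body: weight the prefix field above the full id.
# The weights 10 and 1 are placeholders, not tuned values.
search_body = {
    "query": "S33",
    "search_fields": {
        "id_prefix": {"weight": 10},
        "id": {"weight": 1},
    },
}

for doc in enriched:
    print(doc)
```

With this, "S06.33" gets the prefix "S06", so it no longer shares a prefix with the "S33" family.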

RurouniKakita · July 6, 2022, 10:54pm

The precision is currently set to 5 for that engine. I think I have tried it higher and lower, and the same thing still happens. For reference, these are the searchable fields from the documents in that engine:

icdcode - text (searchable, retrievable)
name - text (searchable, retrievable)
section - text (searchable, retrievable)
defaultbodysystems - text (searchable, retrievable)
supertopicterms - text (Array of text, searchable, retrievable)

RurouniKakita · July 15, 2022, 11:17pm

I updated the engine with a lowercase ICD code field and that helps a little. I've been diving into the settings and found the following in the engine's analysis.filter.delimiter section: split_on_numerics: true. If this were set to false, would it possibly improve my results, and if so, how would I change this value for my Elastic Cloud instance?
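
To illustrate why that setting might matter, here is a rough Python emulation of delimiter-style tokenization. It is not the actual App Search analyzer: the function `delimiter_tokens` is hypothetical, it ignores case-change splitting and the other analysis steps, and real scores also depend on the other searchable fields.

```python
import re
from typing import List


def delimiter_tokens(text: str, split_on_numerics: bool = True) -> List[str]:
    """Approximate a word-delimiter filter: split on non-alphanumerics,
    and optionally on letter/number transitions, then lowercase."""
    tokens: List[str] = []
    for part in re.split(r"[^A-Za-z0-9]+", text):
        if not part:
            continue
        if split_on_numerics:
            # Break "S33" into a letter run and a digit run.
            tokens.extend(re.findall(r"[A-Za-z]+|[0-9]+", part))
        else:
            tokens.append(part)
    return [token.lower() for token in tokens]


query = "S33"
for code in ["S33.1", "S06.33"]:
    for flag in (True, False):
        shared = set(delimiter_tokens(query, flag)) & set(delimiter_tokens(code, flag))
        print(code, "split_on_numerics =", flag, "shared tokens:", sorted(shared))
```

Under this approximation, with splitting on numerics enabled both "S33.1" and "S06.33" share the tokens "s" and "33" with the query, which is consistent with their identical scores. With it disabled, only "S33.1" shares a token ("s33").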

Sean_Story · July 22, 2022, 5:40pm

@RurouniKakita we currently do not support modifying the mappings or index settings for App Search document indices. If you need finer-grained control over how your fields are delimited, I suggest taking a look at: Elasticsearch index engines (technical preview) | Elastic App Search Documentation [8.3] | Elastic
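
For an index you manage yourself in Elasticsearch, the analysis settings might look something like the sketch below. The names `icd_delimiter` and `icd_analyzer` are made up for illustration, and whether an engine built on such an index supports every App Search feature is something to confirm in the linked documentation. The sketch assumes the `word_delimiter_graph` token filter with its `split_on_numerics` parameter.

```python
import json

# Hypothetical index settings: a custom analyzer that keeps letter/number
# runs such as "S33" together instead of splitting them into "S" and "33".
index_settings = {
    "analysis": {
        "filter": {
            "icd_delimiter": {  # illustrative name
                "type": "word_delimiter_graph",
                "split_on_numerics": False,
            }
        },
        "analyzer": {
            "icd_analyzer": {  # illustrative name
                "type": "custom",
                "tokenizer": "whitespace",
                "filter": ["icd_delimiter", "lowercase"],
            }
        },
    }
}

# Print the JSON that would go under "settings" in a create-index request.
print(json.dumps(index_settings, indent=2))
```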

system · August 19, 2022, 5:40pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
