With a single request, precision is lost. There are more than 50 levels in the score hierarchy, and each level is 10 times greater than the previous one. For instance, a score of 100 000 000 plus a score of 1 is still 100 000 000.
With multiple requests, it is not feasible to sum the boosts from all requests for the same documents, because there are too many documents (a performance concern).
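The precision loss can be reproduced outside Elasticsearch. The sketch below assumes scores are held as 32-bit floats (as Lucene does) and emulates that in Python with `struct`; the helper names `to_float32` and `add_float32` are made up for illustration.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add_float32(a, b):
    """Add two numbers the way a 32-bit float accumulator would."""
    return to_float32(to_float32(a) + to_float32(b))


base = 100_000_000.0
total = add_float32(base, 1.0)

# The +1 from the lowest hierarchy level is rounded away.
print(total == base)
```

A 32-bit float has a 24-bit significand, so integers are only exact up to 2^24 = 16 777 216. Past that point a lower level's contribution can disappear when it is added to a higher level's score.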
@dadoonet Could you please correct me if I am wrong, and explain how to implement such a score hierarchy?
Not sure if that will be possible and fast enough though.
I thought about a custom script to normalize scores at each level of the hierarchy, but I did not do that for performance reasons. Normalization would have some precision problems too.
What is the use case? I think I have never heard about such a use case and I'm wondering if you are trying to solve a problem the right way...
I'm trying to implement address autocomplete.
Each address has multiple fields: country, region, city, street, etc.
Using the score hierarchy, I'm trying to achieve more relevant results.
For instance: one matched token in the street field should score higher than 10 (ideally any number of) matched tokens in the country/city/region fields (in one of these fields or in any combination of them).
A correct sum of scores from different hierarchy levels is also important.
For instance: a document with one matched token in the street field and one matched token in the city field should score higher than a document with one matched token in the street field only.
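The ordering described above can be written down as a lexicographic comparison. This is only a sketch of the requirement, not of how Elasticsearch computes scores: the level list, the per-field match counts, and the names `hierarchy_key` and `packed_score` are assumptions made for illustration.

```python
# Hypothetical hierarchy, most important level first.
LEVELS = ["street", "city", "region", "country"]


def hierarchy_key(matches):
    """Tuple of match counts per level; tuples compare level by level."""
    return tuple(matches.get(level, 0) for level in LEVELS)


def packed_score(matches, base=10):
    """Pack the same counts into one integer, one base-10 digit per level."""
    score = 0
    for level in LEVELS:
        count = matches.get(level, 0)
        if count >= base:
            raise ValueError("match count does not fit into one level")
        score = score * base + count
    return score


street_only = {"street": 1}
many_lower = {"city": 5, "region": 4, "country": 1}  # 10 matched tokens
street_and_city = {"street": 1, "city": 1}

# One street match outranks ten matches on lower levels.
print(hierarchy_key(street_only) > hierarchy_key(many_lower))
# Street + city outranks street only.
print(packed_score(street_and_city) > packed_score(street_only))
```

The tuple form works for any number of matches per level. The packed form needs every count to stay below the base, and with about 50 levels it needs far more digits than a 32-bit float score can hold.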
If I understand this demo correctly, it will not work as expected from a hierarchy point of view:
a document with multiple matched tokens in the city field will have a greater score than a document with one matched token in the street_name field.
Hi @dadoonet, sorry for the delay.
I've reproduced the problem.
In the example below, both addresses have the same score (1.0) when searching for StreetToken1.
I want the address with StreetToken1 in the street name to come first, with a greater score than the second address, which has StreetToken1 in the city name.
PS. I used boolean similarity for better relevance of results.
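A toy model of why both scores come out equal (this is not the original reproduction, only an illustration). With boolean similarity, a matching term scores exactly its query boost, so a match in any field counts as 1.0 unless the fields are boosted differently. The documents, the `boolean_score` helper, and the choice to sum the per-field scores are assumptions.

```python
def boolean_score(doc, token, boosts=None):
    """Sum the boost (default 1.0) of every field that contains the token."""
    boosts = boosts or {}
    return sum(
        boosts.get(field, 1.0)
        for field, tokens in doc.items()
        if token in tokens
    )


street_doc = {"street_name": ["StreetToken1"], "city_name": ["CityToken1"]}
city_doc = {"street_name": ["StreetToken2"], "city_name": ["StreetToken1"]}

# Without boosts the two addresses are indistinguishable.
print(boolean_score(street_doc, "StreetToken1"))
print(boolean_score(city_doc, "StreetToken1"))

# A per-field boost separates them.
boosts = {"street_name": 10.0}
print(boolean_score(street_doc, "StreetToken1", boosts))
print(boolean_score(city_doc, "StreetToken1", boosts))
```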
I think that with much more volume this will work automatically, because StreetToken1 as a city_name will be much more frequent and thus less relevant for the city_name field than for the street_name field.
To get more relevant results, I decided not to use token frequency (that's why I use boolean similarity).
Unfortunately, this approach is not enough, because there are a lot of fields (~20) and types of search, which causes a loss of score precision. As I mentioned earlier, to have the score hierarchy, each level should be 10 times greater than the previous one: