Querying an alias throws off scoring completely?

Hey all!

I’m seeing some really weird behaviour around index aliases, and maybe I’m doing something conceptually dumb.

We have two indexes of very different sizes (for example, 4 million docs in entities-a and 2(!) docs in entities-b), both of which sit behind a single alias, entities.

Now when I search entities with a query that perfectly matches one of the 2 docs in entities-b, I get junk results from entities-a instead. If I remove entities-a from the alias, the same query returns the doc from entities-b properly.

The sense I’m getting is that the imbalance of the indexes completely throws off relevancy scoring - is that likely? Is there any way to address it?

Hi,

If you want to prioritize results from entities-b over entities-a, you can use the indices_boost parameter in your search query.
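As a rough sketch of what that could look like, the snippet below only builds the JSON body for a search against the alias; it does not send anything. The helper name `build_boosted_search`, the `name` field, the `acme` value, and the boost factors are all made up for illustration. The list-of-objects form of indices_boost is the one used by recent Elasticsearch versions, so check the docs for your version.

```python
import json


def build_boosted_search(query, boosts):
    """Return a search body that weights hits per index via indices_boost.

    `boosts` is a list of (index_name, boost) pairs; this helper is
    hypothetical and not part of any Elasticsearch client.
    """
    return {
        "query": query,
        # One single-key object per index, in the order given.
        "indices_boost": [{index: boost} for index, boost in boosts],
    }


body = build_boosted_search(
    {"match": {"name": "acme"}},  # hypothetical field and value
    [("entities-b", 2.0), ("entities-a", 1.0)],  # illustrative boost factors
)

# This would be sent as the request body of a search on the entities alias.
print(json.dumps(body, indent=2))
```

Note that this only multiplies the scores from one index; it does not change how those scores are computed in the first place.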

Regards

Thanks for that hint! While that may be a possible work-around, I would like to understand the problem a bit more before hacking it. The thing is that the match in entities-b is a perfect result (unique match on a boosted keyword field), so it should come out on top even if both indexes are queried...

Have you tried setting the search_type query parameter to dfs_query_then_fetch? Does that make any difference?
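For reference, a minimal sketch of where the parameter goes: search_type is a URL query parameter on the _search endpoint, not part of the request body. The host `http://localhost:9200` and the helper `search_url` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def search_url(host, target, search_type=None):
    """Build a _search URL for an index or alias (hypothetical helper)."""
    url = f"{host}/{target}/_search"
    params = {}
    if search_type:
        # Leaving this out keeps the default search type, query_then_fetch.
        params["search_type"] = search_type
    return f"{url}?{urlencode(params)}" if params else url


# Assumed local node; replace with your own host.
url = search_url("http://localhost:9200", "entities", "dfs_query_then_fetch")
print(url)
```

The query body itself stays exactly the same; only the URL changes.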

That looks to have made a difference!! My sample queries are coming back with the correct doc now. Is this safe to do? Does it point to an underlying issue, or is it just a little price we'll have to pay for this whacky setup?

Yes, it is safe. It is a different search type, designed specifically to better handle shards with different sizes and distributions of data.

No, it does not really point to an underlying issue. Have a look at the docs and the links therein on distributed term frequencies and relevance scoring.
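To make the effect concrete, here is a simplified sketch of the inverse document frequency (IDF) part of the score. It assumes the BM25 IDF formula used by Lucene's default similarity and one shard per index; the document frequency of 10 for entities-a is a made-up number, and real scores also depend on term frequency, field length, and boosts.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """IDF term of BM25: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Default search type: each shard scores with its own local statistics.
# Term found in 1 of the 2 docs in entities-b.
idf_b_local = bm25_idf(doc_count=2, doc_freq=1)
# Hypothetical: a query term found in 10 of the 4 million docs in entities-a.
idf_a_local = bm25_idf(doc_count=4_000_000, doc_freq=10)

# dfs_query_then_fetch: statistics are gathered from all shards first,
# so both shards score the term against the same combined numbers.
idf_global = bm25_idf(doc_count=4_000_002, doc_freq=11)

print(f"entities-b, local stats: {idf_b_local:.2f}")
print(f"entities-a, local stats: {idf_a_local:.2f}")
print(f"both, global stats:      {idf_global:.2f}")
```

With local statistics, a term that appears in half of a two-document index looks very common and gets a low weight, while a term that is rare within four million documents gets a high one. That is how a perfect match in the tiny index can lose to weaker matches in the big one.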

Thank you very much for your competent help, @Christian_Dahlqvist! Appreciate you taking the time.
