I’m seeing some really weird behaviour around index aliases; maybe I’m doing something conceptually dumb.
We have two indexes of very different sizes (for example, 4 million docs in entities-a and 2(!) docs in entities-b), and both are attached to a single alias, entities.
Now when I run a query against entities that perfectly matches one of the 2 docs in entities-b, I get junk results from entities-a instead. If I remove entities-a from the alias, the same query returns the doc from entities-b properly.
The sense I’m getting is that the imbalance of the indexes completely throws off relevancy scoring - is that likely? Is there any way to address it?
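To make the suspected effect concrete, here is a rough sketch of how index imbalance can skew scores. It assumes the default BM25 similarity, one shard per index, and term statistics computed per shard (the default search type, query_then_fetch); the document frequencies below are made-up numbers for illustration, not measurements from these indexes. It only computes the IDF part of the score, which is the part that depends on index size.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF term as used by Lucene's BM25 similarity."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for the same query term in each index.
# entities-a: 4 million docs, term appears in 100 of them (assumed).
idf_a_local = bm25_idf(4_000_000, 100)
# entities-b: 2 docs, term appears in 1 of them.
idf_b_local = bm25_idf(2, 1)

# With global statistics (what search_type=dfs_query_then_fetch gathers),
# both shards would score the term against the combined numbers.
idf_global = bm25_idf(4_000_000 + 2, 100 + 1)

print(f"local IDF in entities-a: {idf_a_local:.3f}")
print(f"local IDF in entities-b: {idf_b_local:.3f}")
print(f"global IDF (both):       {idf_global:.3f}")
```

Under these assumptions the term counts for far less in the tiny index, because appearing in 1 of 2 docs looks "common" locally. A weak match in entities-a can therefore outscore an exact match in entities-b, even with a field boost. With shared statistics both indexes use the same IDF, so the comparison becomes fair at the cost of an extra round trip per search.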
Thanks for that hint! While that may be a possible workaround, I would like to understand the problem a bit more before hacking around it. The thing is that the match in entities-b is a perfect result (a unique match on a boosted keyword field), so it should come out on top even when both indexes are queried...
That looks to have made a difference!! My sample queries are coming back with the correct doc now. Is this safe to do? Does it point to an underlying issue, or is it just a small price we'll have to pay for this wacky setup?