Hi, I have two fields in my index that I query: title and content.
Content is a large field that holds the entire text of an article.
Both fields use the same analyzer, with min_gram set to 3 and max_gram set to 20.
The problem is that when a search term fully matches the content field of one document but only partially matches the title field of another, the title match takes precedence.
For example, I searched for the term hacking.
There is one document that contains hacking in its content field, and there are several documents whose titles contain tracking (which is not hacking by any means).
When I query for hacking, all the tracking results come out on top, and the hacking document ends up somewhere on the third or fourth page of results. This is not what I expect: the hacking result should come first. When I check the scores, a document with tracking in the title gets 0.6, while the one with hacking in the content gets 0.08, even though I queried for hacking.
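To illustrate why tracking can match a query for hacking at all, here is a minimal Python sketch of n-gram generation. It assumes a plain ngram tokenizer (not edge_ngram) with the min_gram 3 and max_gram 20 from above, and that the same analyzer is applied at search time; the `ngrams` helper is hypothetical and only approximates what the analyzer emits.

```python
def ngrams(term, min_gram=3, max_gram=20):
    """Return every substring of `term` with length between min_gram and max_gram.

    Hypothetical helper that only approximates an ngram tokenizer.
    """
    term = term.lower()
    grams = set()
    # Gram sizes are capped by the term length, so max_gram=20 is never reached here.
    for size in range(min_gram, min(max_gram, len(term)) + 1):
        for start in range(len(term) - size + 1):
            grams.add(term[start:start + size])
    return grams


query_grams = ngrams("hacking")
title_grams = ngrams("tracking")
shared = query_grams & title_grams

# List the grams both words produce, shortest first.
print(sorted(shared, key=lambda g: (len(g), g)))
print(len(shared), "of", len(query_grams), "query grams also occur in 'tracking'")
```

Under these assumptions, 10 of the 15 grams produced from hacking (everything inside "acking") are also produced from tracking, so a short title field can score well on the overlap alone.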
@dadoonet Do you mean the content field, because content is the larger field?
Anyway, even after I boost the content field, the results are still not satisfactory, although they are better than before.
Can't we give an exact match first precedence, no matter which field it is in, and have the partial matches come after it in the search results?
You need a field without ngrams (see copy_to or multi-fields) and to include it in the search with a high boost. That should resolve the problem.
That said, I suspect that using ngrams here is a bad idea to begin with.
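A minimal sketch of the multi-fields suggestion, written as Python dicts that can be serialized to JSON request bodies. The analyzer name `ngram_analyzer`, the sub-field name `exact`, and the boost of 10 are assumptions for illustration; the `text` type applies to newer Elasticsearch versions (older ones used `string`), and the right boost depends on your data.

```python
import json

# Hypothetical name of the existing ngram analyzer (min_gram 3, max_gram 20),
# assumed to be defined already in the index settings.
NGRAM_ANALYZER = "ngram_analyzer"


def ngram_field_with_exact_subfield():
    """Field mapping that keeps the ngram analysis and adds a non-ngram sub-field."""
    return {
        "type": "text",
        "analyzer": NGRAM_ANALYZER,
        "fields": {
            # "exact" is an illustrative sub-field name; it indexes whole words only.
            "exact": {"type": "text", "analyzer": "standard"}
        },
    }


mapping = {
    "properties": {
        "title": ngram_field_with_exact_subfield(),
        "content": ngram_field_with_exact_subfield(),
    }
}


def build_query(term, exact_boost=10):
    """multi_match over both fields, boosting the non-ngram sub-fields."""
    return {
        "query": {
            "multi_match": {
                "query": term,
                "fields": [
                    f"title.exact^{exact_boost}",
                    f"content.exact^{exact_boost}",
                    "title",
                    "content",
                ],
            }
        }
    }


print(json.dumps(mapping, indent=2))
print(json.dumps(build_query("hacking"), indent=2))
```

With this layout, a whole-word match on hacking in either sub-field gets the boosted score, while the ngram fields still return partial matches such as tracking further down.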