I noticed the edge_ngram analyzer scoring exact matches of a query lower than poor matches, so I copied the data into a second field with the standard analyzer to compare.
I now have 2 fields with the same content but different analyzers:
- `text_xx`: standard analyzer
- `text_en`: edge_ngram analyzer (3 to 20 chars)
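To make the comparison concrete, here is a rough sketch of the tokens an edge_ngram setup with 3 to 20 chars would produce for my document and my query. It is plain Python, not Elasticsearch itself, and it assumes whitespace tokenization plus lowercasing before the n-gram step (my real analyzer chain may differ). The helper names `edge_ngrams` and `analyze` are made up for illustration.

```python
def edge_ngrams(token, min_gram=3, max_gram=20):
    """Return the prefixes of `token` with lengths min_gram..max_gram.

    Tokens shorter than min_gram yield nothing.
    """
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text, min_gram=3, max_gram=20):
    """Assumed chain: whitespace split, lowercase, then edge n-grams per token."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(edge_ngrams(word, min_gram, max_gram))
    return tokens


doc_tokens = analyze("UVV 00031 099 Filterschlauch")
query_tokens = analyze("UVV 00031 099")

print(doc_tokens)
print(query_tokens)
```

Under these assumptions, and if the same analyzer also runs at search time, the query is expanded into short prefixes such as `000` and `0003` in addition to the full terms, so it can match far more documents than the three literal terms alone.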
Searching on the edge_ngram field yields the wrong rank/score IMHO.
I have a document which has text_en/text_xx="UVV 00031 099 Filterschlauch" and when I search for "UVV 00031 099" I expect this document to appear first. There are 50K+ documents containing "UVV".
- Searching `(UVV 00031 099)`, I get the result I want in second position.
- Searching `text_en: (UVV 00031 099)`, I don't even get the document at all.
- Searching `text_xx: (UVV 00031 099)`, I get the correct document in first place.
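For reference, the three searches can be written as request bodies roughly like this. This is only a sketch: it assumes the searches go through the `query_string` query, and the helper name `query_string_body` is made up. The `explain` flag asks Elasticsearch to include the score explanation for each hit.

```python
import json


def query_string_body(query):
    """Build a search request body for a query_string query (hypothetical helper)."""
    return {
        "query": {"query_string": {"query": query}},
        "explain": True,  # include per-hit scoring details in the response
    }


searches = [
    "(UVV 00031 099)",           # no field given, default fields are searched
    "text_en: (UVV 00031 099)",  # edge_ngram field only
    "text_xx: (UVV 00031 099)",  # standard analyzer field only
]

bodies = [query_string_body(q) for q in searches]

for body in bodies:
    print(json.dumps(body, indent=2))
```

Each body would be sent to the `_search` endpoint of the index. Comparing the explanations for the `text_en` and `text_xx` variants side by side should show which terms contribute to each score.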
Why does the Edge-NGram analyzer not score the exact match highest?
See the explanation of the search here.