Hello, I have a document indexed with a custom analyzer.
The string is "MKS-Integrity11.1_Software-FAQ.docx".
It gets tokenized into:
{
  "tokens": [
    {
      "token": "mks",
      "start_offset": 0,
      "end_offset": 3,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "integrity",
      "start_offset": 4,
      "end_offset": 13,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "11",
      "start_offset": 13,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "1",
      "start_offset": 16,
      "end_offset": 17,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "software",
      "start_offset": 18,
      "end_offset": 26,
      "type": "<ALPHANUM>",
      "position": 4
    },
    {
      "token": "faq",
      "start_offset": 27,
      "end_offset": 30,
      "type": "<ALPHANUM>",
      "position": 5
    },
    {
      "token": "docx",
      "start_offset": 31,
      "end_offset": 35,
      "type": "<ALPHANUM>",
      "position": 6
    }
  ]
}
For the query I use a phrase_prefix with a slop of 50.
If I query for anything that contains "software", e.g. "software faq", I get zero matches.
If I query for "MKS faq" I get the match, but "mks software faq" gives no match.
I wonder why that happens, since the term "software" itself is tokenized correctly.
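
A query of the kind described above might look roughly like the sketch below. This is only an illustration of the shape: the field name "filename" is a placeholder (the real mapping is not shown here), and using multi_match with type phrase_prefix is an assumption about how the phrase_prefix query is issued.

```json
{
  "query": {
    "multi_match": {
      "query": "software faq",
      "type": "phrase_prefix",
      "fields": ["filename"],
      "slop": 50
    }
  }
}
```

Replacing the "query" value with "MKS faq" or "mks software faq" gives the other two cases mentioned above.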
Thanks in advance,
Lukas