Hello,
I'm using an ngram tokenizer for full-text search and have some questions about term frequency.
Full example: http://paste.ubuntu.com/14478646/
If I use the explain API on my search query (last 2 curl commands) for document 1 ("abcd - foooooooabcd"), I get a term frequency of 2.0 for the string "abc", which is what I expect. But when I search for "abcd", I get a term frequency of 17.0. Why? Shouldn't this also be 2.0?
The index analyzer seems to work fine when I check it with the _analyze API, so where does this odd term frequency come from?
curl -XGET 'localhost:9200/test20160107/_analyze?analyzer=index_ngram_wd_analyzer&pretty' -d "abcd"
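To make my expectation concrete, here is a minimal Python sketch of how I count the term frequency by hand. It is not what Elasticsearch does internally. It assumes a plain character n-gram split per word, with min_gram=3 and max_gram=4 as hypothetical values (my real settings are in the paste linked above), and the helper names are made up for illustration.

```python
import re
from collections import Counter


def ngrams(word, min_gram, max_gram):
    """Yield every character n-gram of `word` with min_gram <= n <= max_gram."""
    for n in range(min_gram, max_gram + 1):
        for i in range(len(word) - n + 1):
            yield word[i:i + n]


def term_frequencies(text, min_gram=3, max_gram=4):
    """Count n-grams per document.

    Assumption: words are split on non-alphanumeric characters before
    n-gramming; min_gram/max_gram are placeholder values, not the ones
    from my actual index settings.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    return Counter(g for w in words for g in ngrams(w, min_gram, max_gram))


tf = term_frequencies("abcd - foooooooabcd")
print(tf["abc"], tf["abcd"])  # prints: 2 2
```

Under these assumptions both "abc" and "abcd" occur exactly twice in document 1, which is why the 17.0 surprises me.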
Thanks in advance for any help!
Best regards,
Torben