Ngram analyzer and term frequency

Torben · January 12, 2016, 4:22pm

Hello,

I'm using an ngram tokenizer for full-text search and have some questions about term frequency.

Full example: http://paste.ubuntu.com/14478646/

If I use the explain API on my search queries (the last 2 curl commands) for document 1 ("abcd - foooooooabcd"), I get a term frequency of 2.0 for the string "abc", which is what I expect. But when I search for "abcd", I get a term frequency of 17.0. Shouldn't this also be 2.0?
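To make that expectation concrete, here is a minimal sketch in plain Python (not Elasticsearch code) that counts how often each n-gram occurs in document 1. The helper names `char_ngrams` and `term_frequencies`, the gram sizes 3 to 4, and the word-splitting rule are assumptions for illustration only; the real analyzer settings are in the paste linked above and may differ.

```python
import re
from collections import Counter


def char_ngrams(text, min_gram, max_gram):
    """Return every character n-gram of `text` with a length in [min_gram, max_gram]."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


def term_frequencies(field_value, min_gram=3, max_gram=4):
    """Count n-gram occurrences in one field value.

    Assumption: the value is split into alphanumeric words first and each
    word is n-grammed separately. The gram sizes are hypothetical.
    """
    words = re.findall(r"[A-Za-z0-9]+", field_value)
    counts = Counter()
    for word in words:
        counts.update(char_ngrams(word, min_gram, max_gram))
    return counts


tf = term_frequencies("abcd - foooooooabcd")
print(tf["abc"], tf["abcd"])  # prints: 2 2
```

Under these assumptions both "abc" and "abcd" occur exactly twice in the document, which is why a value of 17.0 for "abcd" looks surprising.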

The index analyzer works fine, so why do I get this odd term frequency? This is how I checked the analyzer:
curl -XGET 'localhost:9200/test20160107/_analyze?analyzer=index_ngram_wd_analyzer&pretty' -d "abcd"

Thanks in advance for any help!

Best regards,
Torben
