Following problem: I want autocompletion of single words. That is, if I type "bor", I want to get "boring (num: 29)" and "border (num: 10)" back. "border" and "boring" occur in several large texts. I do not want the whole text back, just the number of documents where "boring" and "border" occur and, of course, the terms "boring" and "border" themselves.
Do you want a list of snippets where it occurs? Or do you want a breakdown of all the terms in the index that start with bor by count?
The former sounds like a highlighting problem. The latter sounds like a terms aggregation with a prefix filter. I've used the latter for single-term autocomplete before. Whether it works well depends on the size of your index and of your term dictionary (i.e., the number of unique terms).
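For illustration, here is a minimal sketch of what such a request body could look like, built as a plain Python dict. It restricts the buckets with the terms aggregation's `include` pattern, which is one way to read "prefix filter" and not necessarily what was used above. The field name `text`, the aggregation name `completions`, and the helper `prefix_terms_request` are made up for the example, and depending on the Elasticsearch version, aggregating on an analyzed text field may require fielddata to be enabled.

```python
import json


def prefix_terms_request(field, prefix, size=10):
    """Build a search body that returns per-term document counts for terms
    starting with `prefix`. Assumes `prefix` is a plain lowercase word with
    no regex metacharacters (hypothetical helper, not part of any client)."""
    return {
        "size": 0,  # suppress the hits, only the aggregation is of interest
        "aggs": {
            "completions": {  # hypothetical aggregation name
                "terms": {
                    "field": field,
                    "include": prefix + ".*",  # keep only terms with this prefix
                    "size": size,  # maximum number of suggestions returned
                }
            }
        },
    }


body = prefix_terms_request("text", "bor")
print(json.dumps(body, indent=2))
```

Each bucket in the response carries the term as its key and a doc_count, which is the number of documents containing that term.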
I want a list of all the terms in the index starting with bor.
When I have these 6 datasets:
"blah is blubb boring"
"blah is blah"
"boring is border booooo"
"narf is border"
"border is boring"
"blah is border blaaaaaah"
and I search for "bor", I want to get boring: 3, border: 4 as the result. My main problem is how to get the full terms.
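To make the expected result concrete, here is a small sketch in plain Python that reproduces those counts for the six datasets above. It only mimics the counting (whitespace tokenization, lowercasing, each term counted once per document) and does not talk to Elasticsearch; the function name `doc_counts_for_prefix` is made up for the example.

```python
from collections import Counter

docs = [
    "blah is blubb boring",
    "blah is blah",
    "boring is border booooo",
    "narf is border",
    "border is boring",
    "blah is border blaaaaaah",
]


def doc_counts_for_prefix(docs, prefix):
    """Count, per term starting with `prefix`, how many documents contain it."""
    counts = Counter()
    for doc in docs:
        # A set ensures each term is counted at most once per document.
        terms = set(doc.lower().split())
        counts.update(t for t in terms if t.startswith(prefix))
    return dict(counts)


result = doc_counts_for_prefix(docs, "bor")
for term, n in sorted(result.items()):
    print(f"{term}: {n}")
# prints:
# border: 4
# boring: 3
```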
Late reply, but: thanks! After some failures caused by bad list definitions in my model, it works now, and it's really fast, even with match_phrase_prefix. But I think I'll test ngrams soon.