Hi All,
Still fairly new to Elasticsearch, but very impressed so far. Right now
I'm working on a place finder service that will access a repository of
place names. I'm attempting to build in some autocomplete functionality,
and while I've made significant progress, it's not perfect. My current
mapping on the given field for both index and search is based on the
following analyzer:
"analyzer_shingle" : { "tokenizer" : "standard", "filter" : [ "standard",
"lowercase", "filter_shingle"] }
where filter_shingle is defined as follows:
"filter_shingle" : { "type" : "shingle", "max_shingle_size" : 5,
"min_shingle_size" : 2, "output_unigrams" : "true" }
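For reference, here is a rough sketch of how those two fragments might sit together in the index settings, written as a Python dict. The function name is only for illustration, and the nesting under "analysis" is my understanding of the usual settings layout rather than a copy of my actual index definition.

```python
import json


def build_analysis_settings():
    """Assemble the analyzer and shingle filter described above (illustrative helper)."""
    return {
        "analysis": {
            "filter": {
                "filter_shingle": {
                    "type": "shingle",
                    "max_shingle_size": 5,
                    "min_shingle_size": 2,
                    "output_unigrams": True,
                }
            },
            "analyzer": {
                "analyzer_shingle": {
                    "tokenizer": "standard",
                    "filter": ["standard", "lowercase", "filter_shingle"],
                }
            },
        }
    }


settings = build_analysis_settings()
# Serialize to JSON so the structure can be pasted into an index-creation request.
settings_json = json.dumps(settings, indent=2)
```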
I query the field that uses this analyzer with a matchPhrasePrefixQuery,
with a fuzziness of 0.8 and a maxExpansions of 30.
I also have a field with a keyword analyzer that is queried with a
matchPhrasePrefixQuery as well. That query is boosted so that values that
start with the entered text rank significantly higher.
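Roughly, the request body looks something like the sketch below, written as a Python dict. The field names ("name", "name.raw"), the boost value and the bool/should wrapper are placeholders for illustration, not my real mapping. The fuzziness of 0.8 is left out of the sketch because I'm not sure how this query type treats it.

```python
def build_autocomplete_query(text,
                             shingle_field="name",      # hypothetical field name
                             keyword_field="name.raw",  # hypothetical field name
                             keyword_boost=10.0,        # placeholder boost value
                             max_expansions=30):
    """Build a search body with two match_phrase_prefix clauses (illustrative helper)."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Clause against the shingle-analyzed field.
                    {
                        "match_phrase_prefix": {
                            shingle_field: {
                                "query": text,
                                "max_expansions": max_expansions,
                            }
                        }
                    },
                    # Boosted clause against the keyword-analyzed field, so that
                    # values starting with the entered text score higher.
                    {
                        "match_phrase_prefix": {
                            keyword_field: {
                                "query": text,
                                "max_expansions": max_expansions,
                                "boost": keyword_boost,
                            }
                        }
                    },
                ]
            }
        }
    }


body = build_autocomplete_query("Goat Isla")
```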
For the most part, this works great! I mean it really nails the search
every time and it's blazing fast.
So here's my issue: while this setup is working well, it fails if there
are any additional words after the phrase that aren't found in the actual
data. For instance, if I search for "Goat", I get results like the following:
Goat
Goat Corral Flat
Goat Island
Goat Island Preserve Trail
Big Goat Road
Then if I search for "Goat Isla", I find a whole bunch of Goat Islands.
However, if I continue typing say, "Goat Island United States", the search
doesn't return any results. Now that bums me out for two reasons. On one
hand, this doesn't seem to make sense with the shingle filter, but maybe
I'm wrong. In my understanding, the shingle filter will make something
like the following tokens:
Goat
Goat Island
Goat Island United States
Island
Island United
United States
and so on and so forth...
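To make that token list concrete, here is a small Python approximation of what I believe the analyzer produces with my settings (min 2, max 5, unigrams on, lowercased). It splits on whitespace instead of using the real standard tokenizer, so it is only a sketch of the behaviour, and the function name is made up.

```python
def shingles(text, min_size=2, max_size=5, output_unigrams=True):
    """Approximate lowercase + shingle analysis using a plain whitespace split."""
    words = text.lower().split()
    tokens = []
    for start in range(len(words)):
        if output_unigrams:
            tokens.append(words[start])
        # Emit every shingle of min_size..max_size words starting at this word.
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(words):
                break
            tokens.append(" ".join(words[start:end]))
    return tokens


indexed_tokens = shingles("Goat Island")
query_tokens = shingles("Goat Island United States")

# Tokens that the search text and the indexed value have in common.
shared = sorted(set(indexed_tokens) & set(query_tokens))
```

By this approximation the search text and an indexed "Goat Island" do share several tokens, which is why I expected at least some hits.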
Since all these tokens are passed into the search, and they are searching
on shingle tokenized data, then there should definitely be matches,
correct? "Goat Island" should still match some Goat Islands, and Island
should match a whole bunch of other things. Shouldn't I be finding data
here? Any thoughts on what I might be doing wrong? I would like to use
the "United States" part of the search in an additional query on another
field.
Thanks in advance for any help or direction!