nGrams and Wildcards


(MyHeadHurts) #1

I am having trouble searching with wildcards when indexing data using nGram.

For example:

If I index data using the following index settings:

index:
  analysis:
    analyzer:
      default:
        type: standard
        stopwords: none

I am able to search for something like "b?t" and get results for "but", "bat", etc.

However, if I change the index settings to use nGram for partial word matching, as follows:

index:
  analysis:
    analyzer:
      default:
        type: custom
        tokenizer: nGramTokenizer
        filter: [lowercase, stopWordsFilter]
    tokenizer:
      nGramTokenizer:
        type: nGram
        min_gram: 1
        max_gram: 2
    filter:
      stopWordsFilter:
        type: stop
        stopwords: none

Then partial word matching works, i.e. searching "bu" returns "but", "bug", etc., but wildcards no longer seem to be supported: if I search "b?t", there are no matching hits.

Is there a way that I can use wildcards and nGram together?
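
A likely explanation, sketched in plain Python rather than in Elasticsearch itself: a wildcard query is compared against whole indexed terms, and with min_gram 1 and max_gram 2 no indexed term is longer than two characters, so the three-character pattern "b?t" has nothing to match. The ngrams and wildcard_hits helpers below are hypothetical stand-ins for the nGram tokenizer and for wildcard matching; they are not Elasticsearch APIs.

```python
from fnmatch import fnmatchcase


def ngrams(text, min_gram=1, max_gram=2):
    """Hypothetical stand-in for the nGram tokenizer:
    every substring whose length is between min_gram and max_gram."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


def wildcard_hits(pattern, terms):
    """Return the indexed terms that match a ?/* wildcard pattern as a whole."""
    return sorted(t for t in terms if fnmatchcase(t, pattern))


ngram_terms = ngrams("but")   # terms produced by the nGram settings above
standard_terms = {"but"}      # the standard analyzer keeps the whole word

print(sorted(ngram_terms))                   # ['b', 'bu', 't', 'u', 'ut']
print(wildcard_hits("b?t", ngram_terms))     # []
print(wildcard_hits("b?t", standard_terms))  # ['but']
```

Under this reading, raising max_gram to at least the length of the words being searched would let whole words be indexed as terms too, at the cost of a larger index.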


(David Pilato) #2

Can the multi-field feature help you?
http://www.elasticsearch.org/guide/reference/mapping/multi-field-type.html
You can index your field with different analyzers.
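
As a rough sketch of what that could look like, the Python snippet below builds an index body in which one field is indexed twice: once with the standard analyzer (for wildcards) and once with an nGram analyzer (for partial matching). The type name "doc", the field name "title", the sub-field name "ngram" and the analyzer name "ngramAnalyzer" are all made up for illustration, and the multi_field shape follows the multi-field documentation linked above; newer Elasticsearch versions express the same idea differently, so treat this as an assumption to check against your version.

```python
import json

# Hypothetical names throughout: "doc", "title", "ngram", "ngramAnalyzer".
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "ngramAnalyzer": {
                    "type": "custom",
                    "tokenizer": "nGramTokenizer",
                    "filter": ["lowercase"],
                }
            },
            "tokenizer": {
                "nGramTokenizer": {"type": "nGram", "min_gram": 1, "max_gram": 2}
            },
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "title": {
                    "type": "multi_field",
                    "fields": {
                        # Whole words: target for wildcard queries such as b?t
                        "title": {"type": "string", "analyzer": "standard"},
                        # n-grams: target for partial matching such as "bu"
                        "ngram": {"type": "string", "analyzer": "ngramAnalyzer"},
                    },
                }
            }
        }
    },
}

# Print the JSON that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```

With a mapping along these lines, a wildcard query would be run against title and a partial-word query against title.ngram.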

HTH
David

On 3 October 2012 at 14:32, MyHeadHurts mathieson10@gmail.com wrote:

Is there a way that I can use wildcards and nGram together?

--
David Pilato
http://www.scrutmydocs.org/
http://dev.david.pilato.fr/
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
