I am having trouble searching with wildcards when indexing data using nGram.
For example, if I index data using the following index settings:
index:
  analysis:
    analyzer:
      default:
        type: standard
        stopwords: _none_
I am able to search for something like "b?t" and receive search results for
"but", "bat", etc.
However, if I change the index settings to use nGram for partial word
matching, as follows:
index:
  analysis:
    analyzer:
      default:
        type: custom
        tokenizer: nGramTokenizer
        filter: [lowercase, stopWordsFilter]
    tokenizer:
      nGramTokenizer:
        type: nGram
        min_gram: 1
        max_gram: 2
    filter:
      stopWordsFilter:
        type: stop
        stopwords: _none_
Then partial word matching works, i.e. searching "bu" will return "but"
and "bug" etc., but wildcards no longer seem to be supported: if I
search "b?t", there are no matching hits.
Is there a way that I can use wildcards and nGram together?
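
To illustrate the behaviour described above, here is a minimal Python sketch of what may be going on. It assumes that a wildcard pattern is not analyzed and is compared against individual indexed terms, and that the nGram tokenizer emits every character n-gram between min_gram and max_gram. The helpers `ngrams` and `wildcard_hits` are hypothetical names for this illustration, not Elasticsearch APIs, and the sketch tokenizes one word at a time for simplicity.

```python
from fnmatch import fnmatchcase


def ngrams(text, min_gram=1, max_gram=2):
    """Return every character n-gram of length min_gram..max_gram, lowercased."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


def wildcard_hits(pattern, terms):
    """Return the indexed terms that match the wildcard pattern as whole terms."""
    return sorted(t for t in terms if fnmatchcase(t, pattern))


# Terms as a whole-word analyzer would roughly index them.
whole_word_terms = {"but", "bat", "bug"}

# Terms as the nGram settings above (min_gram 1, max_gram 2) would index them.
ngram_terms = set().union(*(ngrams(w) for w in whole_word_terms))

print(wildcard_hits("b?t", whole_word_terms))  # three-character terms exist
print(wildcard_hits("b?t", ngram_terms))       # no term is longer than two characters
print(wildcard_hits("b?", ngram_terms))        # two-character patterns can still match
```

Under these assumptions, no indexed term is longer than max_gram, so a three-character pattern such as "b?t" has nothing to match, while "bu" matches because it is itself one of the indexed grams.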
--
If this is not possible, is there a way to simulate a query as if we were
using the "?" wildcard character?
(In reply to the question above, posted by My Head Hurts on Wednesday, 3 October 2012 15:14:06 UTC+2.)