Hey,
I've put my configuration and query in a gist.
gistfile1.js
{
"index" : {
"analysis" : {
"filter" : {
"my_ngram" : {
"max_gram" : 20,
"min_gram" : 2,
"type" : "nGram"
},
"my_snow" : {
(gist preview truncated here)
The context
I use a stemmer-ngram filter to index the title field of my documents. (It
enables me to implement a super fast autocompletion btw).
It works great with French words containing é or è (e.g., I can search
for theorie to find documents indexed with
théorie).
But it does not work with French words containing a circumflex (^).
I indexed the title "Pôle web 2.0", which I cannot find using the term "pole".
It seems that the n-gram tokenizer does not recognize ^ as an accent.
Any ideas?
-Alex-
No ideas, anybody?
(In reply to Alexandre Heimburger's original post of 30 August, 12:17.)
Hi,
I don't think the n-gram tokenizer strips any accents. You need to use the
ASCII Folding Token Filter for this, in both index-time and query-time
analysis. I've altered your AC analysis (added the "asciifolding" filter to
both the index-time and query-time analysis); see the gist "stripping accents
in auto-complete analysis". I've tested with "query": "pole" and it matches.
Hope this helps,
Tomislav
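For anyone reading along, here is a rough sketch of the idea in Python. It is not the content of either gist. The analyzer names `ac_index` and `ac_search`, the `standard` tokenizer, and the `lowercase` filter are assumptions made up for illustration; only `my_ngram` (with its 2 to 20 gram sizes) and the built-in `asciifolding` filter come from the thread. The `fold` helper only approximates accent folding with Unicode decomposition; it is not Elasticsearch's actual implementation.

```python
import unicodedata

# Hypothetical settings: analyzer names are invented for this sketch.
# The point is that "asciifolding" appears in both chains, before the n-grams.
SETTINGS = {
    "index": {
        "analysis": {
            "filter": {
                "my_ngram": {"type": "nGram", "min_gram": 2, "max_gram": 20},
            },
            "analyzer": {
                "ac_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding", "my_ngram"],
                },
                "ac_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"],
                },
            },
        }
    }
}


def fold(text):
    """Approximate accent folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ngrams(token, min_gram=2, max_gram=20):
    """All substrings of token whose length is between min_gram and max_gram."""
    return {
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }


def index_terms(title, folding):
    """Lowercase, split on whitespace, optionally fold, then emit n-grams."""
    terms = set()
    for token in title.lower().split():
        if folding:
            token = fold(token)
        terms |= ngrams(token)
    return terms


title = "P\u00f4le web 2.0"  # "Pôle web 2.0", written with an escape for clarity
print("pole" in index_terms(title, folding=False))  # no folded term is indexed
print("pole" in index_terms(title, folding=True))   # folded term is indexed
```

Without folding, the indexed grams keep the ô, so the plain term "pole" has nothing to match. With folding applied before the n-gram step (and also to the query), both sides end up as unaccented text.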
Thanks a lot. I'll test it tomorrow morning and let you know.
(In reply to Tomislav Poljak's message of Wed, Aug 31, 2011, 4:42 PM.)
--
Alexandre Heimburger
R&D Manager
blueKiwi Software
tel : +33687880997
email : ahb@bluekiwi-software.com
address : 93 rue Vieille du Temple, 75003 Paris