Hi all,
I'm a newbie to Elasticsearch. I'm interested in the language analyzers and
I would like to know how they work.
I tried the analyze API with the English language analyzer, and this is the
result:
curl -XGET 'localhost:9200/_analyze?analyzer=english' -d 'Testing language analyzer of Elastic Search: English'
And the tokens I received from ES are:
test
languag
analyz
elast
search
english
Are those tokens the expected result? I thought the tokens would be
whole words, but some of them are incomplete (languag, analyz, elast).
Regards,
Sam
dadoonet
(David Pilato)
August 3, 2012, 7:32am
2
Language analyzers reduce words to their stem (root form), so yes, this is expected.
If you send "languages" or "language", you will get the same token.
Likewise, analyzing, analyzers and analyze are all considered equal once analyzed.
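To make the idea concrete, here is a rough Python sketch of suffix stripping. It is only a toy under my own assumptions: the suffix list and the toy_stem name are made up for illustration, and this is not the stemming algorithm the english analyzer actually runs. It just shows how different forms of a word can collapse to one token.

```python
# Toy suffix stripper for illustration only.
# NOT the real stemmer used by the english analyzer; the suffix list
# below is a hypothetical, hand-picked set that covers the words in this thread.
SUFFIXES = ("ing", "ers", "er", "es", "s", "e")


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    for w in ("language", "languages", "analyzing", "analyzers", "analyze"):
        print(w, "->", toy_stem(w))
```

Because "language" and "languages" end up as the same token, a search for one can match documents containing the other, which is the point of stemming at both index time and query time.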
HTH
David
Wikipedia has some general info on stemming that you may find somewhat helpful.
--
Shaun