Hi,
Does anyone know a way to use a specific (localized) version of the phonetic
analysers through the existing plugin? (For anyone wondering, I'm
looking for something for the French language...)
Thanks.
Yann
--
The Beider-Morse phonetic analyzer was also developed for French and is
available in Lucene Core:
http://stevemorse.org/phonetics/bmpm.htm
In Elasticsearch, the phonetic filter name is "beider_morse".
Best regards,
Jörg
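For readers wiring this up, here is a minimal sketch of what index settings using that encoder might look like. It only builds and prints the JSON body; nothing is sent to a cluster. The analyzer name `phoneticAnalyzer` is the one that appears later in this thread, while the filter name `beider_morse_fr`, the `standard` tokenizer and the `lowercase` filter are assumptions made for illustration. The option names (`encoder`, `rule_type`, `name_type`, `languageset`) are the ones documented for the phonetic analysis plugin, but they should be checked against the plugin version actually installed.

```python
import json


def beider_morse_settings(languages=("french",)):
    """Build index settings with a Beider-Morse phonetic analyzer.

    The filter name "beider_morse_fr" is hypothetical; the tokenizer and the
    lowercase filter are assumptions, not something stated in the thread.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "beider_morse_fr": {
                        "type": "phonetic",
                        "encoder": "beider_morse",
                        "rule_type": "approx",
                        "name_type": "generic",
                        # Restrict the rules to the given languages.
                        "languageset": list(languages),
                    }
                },
                "analyzer": {
                    "phoneticAnalyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "beider_morse_fr"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the body that would be sent when creating the index.
    print(json.dumps(beider_morse_settings(), indent=2))
```

The printed body would be sent when creating the index, and the fields to be matched phonetically would then reference `phoneticAnalyzer` in their mapping.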
--
Hi Jörg,
I had not seen this one. Double Metaphone also seems to do the job. Am I
wrong?
I'll try both in the next few days, hopefully...
Thanks!
Regards,
Yann Barraud
--
If you check double metaphone, you can decide if it meets your requirements.
Note the development timeline of phonetic encodings
So I think Alexander Beider (Paris) must have done a good job in 2008
when he developed a family name matching algorithm.
Best regards,
Jörg
--
Mmmm... Makes (lots of) sense !!
Regards,
Yann Barraud
On 2013/1/25, Jörg Prante wrote:
> If you check double metaphone, you can decide if it meets your requirements.
> Note the development timeline of phonetic encodings:
> - Soundex, 1918 (start of names recognized, number codes)
> - American Soundex, ~1930 (for American English names, used by the U.S. Census Bureau)
> - Kölner Phonetik, 1970 (for German names)
> - Daitch-Mokotoff, 1985 (for Eastern European names)
> - Metaphone, 1990 (improvements for variants in English names)
> - Double Metaphone, 2000 (foreign pronunciation extension, start of names recognized)
> - Beider-Morse, 2008 (pronunciation rules for identified languages, full name recognized)
> So I think Alexander Beider (Paris) must have done a good job in 2008 when
> he developed a family name matching algorithm.
> Best regards,
> Jörg
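To make the idea of a "phonetic code" in the timeline above concrete, here is a minimal sketch of American Soundex, the simplest of the listed encodings. It is not Beider-Morse, which applies language-specific pronunciation rules and is far more elaborate; the function name `soundex` and the handling of non-ASCII letters (treated like vowels) are choices made for this illustration only.

```python
# Letter groups that share a Soundex digit.
_GROUPS = (
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
)
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code of a name."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w do not separate two letters with the same code.
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        # Vowels reset prev, so a repeated code after a vowel is kept.
        prev = code
        if len(result) == 4:
            break
    # Pad short codes with zeros.
    return result.ljust(4, "0")


if __name__ == "__main__":
    for word in ("yann", "yan", "Robert", "Rupert"):
        print(word, soundex(word))
```

Under this scheme "yann" and "yan" collapse to the same code, which is the behaviour a phonetic token filter relies on: the codes, not the original spellings, are what gets indexed and matched.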
--
Hi,
Can anyone tell me how to use the given filter?
{
  "query": {
    "bool": {
      "must": [
        { "field": { "prenom": { "query": "yann" } } },
        { "field": { "nom": { "query": "rimbault" } } },
        { "field": { "code_postal": { "query": "75*" } } }
      ]
    }
  }
}
gives the correct answer (exact match), while
{
  "query": {
    "bool": {
      "must": [
        { "field": { "prenom": { "query": "yan" } } },
        { "field": { "nom": { "query": "rimbault" } } },
        { "field": { "code_postal": { "query": "75*" } } }
      ]
    }
  }
}
gives no answer.
The mapping is set to use the Beider-Morse analyzer on the fields "nom" and "prenom".
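For readers on later Elasticsearch releases, where the `field` query is no longer available, the same request can be sketched with `match` and `prefix` clauses. This is an assumed equivalent, not what was run in this thread: the helper name `name_query` is hypothetical, and the `prefix` clause assumes `code_postal` is indexed as an unanalyzed string. The sketch only builds the request body.

```python
import json


def name_query(prenom: str, nom: str, code_postal_prefix: str) -> dict:
    """Build a bool query equivalent in spirit to the `field` queries above.

    `match` analyzes the query text with the analyzer configured on the
    field, so a phonetic analyzer is applied to the search term as well.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"prenom": {"query": prenom}}},
                    {"match": {"nom": {"query": nom}}},
                    # Stands in for the "75*" wildcard of the original query.
                    {"prefix": {"code_postal": {"value": code_postal_prefix}}},
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(name_query("yan", "rimbault", "75"), indent=2))
```

Whether "yan" then matches a document containing "yann" depends on the phonetic codes produced at index time and at search time, which the `_analyze` API can display for each term.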
--
Yes, the usage is non-trivial, so I prepared an example of how to use
Beider-Morse with Elasticsearch in a gist: "Demonstration of Beider-Morse
phonetic filter with Elasticsearch" (GitHub).
Regards,
Jörg
--
Thanks a lot!
What are the following parts used for?
curl -XGET 'localhost:9200/test/_analyze?analyzer=phoneticAnalyzer&text=yann'
echo
echo "Query 1"
echo
--
The bash script calls the _analyze and _search APIs to demonstrate
the usage for the terms 'yann' and 'yan'.
Jörg
--
Hello,
Works like a charm.
Do you have any idea what the scores mean? I get scores > 13. What does
that mean? I can't figure out why I get such scores, and I find little to no
documentation about how to interpret them...
Yann
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Don't worry. Document scores are not absolute; they are only meaningful
relative to the other scores in the same result set. What you see in the scores
are very short query terms matching very short words (phonetic codes) in the
documents. Elasticsearch's default scoring is the same as Lucene's scoring; you can
find more information in "Apache Lucene - Scoring".
Jörg
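To illustrate why a raw score above 13 is nothing unusual, here is a simplified sketch of the classic Lucene TF-IDF factors for a single term. It deliberately omits query normalization, coordination and boosts, so the numbers it prints are not the scores Elasticsearch would return; the function names and the sample figures (document counts, field length) are made up for the illustration.

```python
import math


def idf(num_docs: int, doc_freq: int) -> float:
    """Inverse document frequency as in Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def simplified_score(freq: int, num_docs: int, doc_freq: int, field_length: int) -> float:
    """tf * idf^2 * lengthNorm for one term, ignoring queryNorm and coord."""
    tf = math.sqrt(freq)
    length_norm = 1.0 / math.sqrt(field_length)
    return tf * idf(num_docs, doc_freq) ** 2 * length_norm


if __name__ == "__main__":
    # Same term statistics, different index sizes: the raw value changes,
    # which is why only the ordering within one result set is meaningful.
    for num_docs in (1_000, 1_000_000):
        score = simplified_score(1, num_docs, doc_freq=5, field_length=4)
        print(num_docs, round(score, 2))
```

The value grows with the size of the index and with the rarity of the matched term, and a `bool` query adds up the contributions of its clauses, so the magnitude says little by itself; only comparisons between hits of the same query are informative.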