The use case I'm addressing right now is searching place hierarchies (which
may include place types as well). In my country, a place hierarchy can be
written in several ways. For instance:
"El corregimiento de Mulaló, jurisdicción del municipio de Yumbo (Valle del
Cauca)"
"El corregimiento de Mulaló, en jurisdicción del municipio de Yumbo del
Valle del Cauca"
"El corregimiento de Mulaló, ubicado en Yumbo, Valle del Cauca"
"El corregimiento de Mulaló, en Yumbo, Valle del Cauca"
"El corregimiento de Mulaló, en el municipio de Yumbo (Valle del Cauca)"
"El corregimiento de Mulaló - Yumbo, Valle del Cauca"
"Mulaló, Yumbo, Valle del Cauca"
"Mulaló, Municipio de Yumbo, en el Valle del Cauca"
"Corregimiento de Mulaló, Municipio de Yumbo, Departamento del Valle del
Cauca"
"Corregimiento de Mulaló, Municipio de Yumbo, Departamento de Valle del
Cauca"
"Corregimiento de Mulaló, Municipio de Yumbo, en el Valle del Cauca"
...
All of those are equivalent.
I want to get rid of articles ("el", "la", "los", "las"), prepositions
("de", "del"), and other interchangeable connectors (e.g. "en",
"jurisdicción", "ubicado en"), so that I can compare the analyzed queries
against the few pre-generated cases I can build from my original JSON docs.
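To make the idea concrete, here is a minimal sketch of that normalization in plain Python, outside Elasticsearch. The word list only covers the examples above, and folding accents (so "Mulaló" and "Mulalo" compare equal) is an assumption of mine, not a requirement; `NOISE_WORDS` and `normalize` are hypothetical names.

```python
import re
import unicodedata

# Assumption: word list taken from the examples above; not exhaustive.
NOISE_WORDS = {
    "el", "la", "los", "las",          # articles
    "de", "del",                       # prepositions
    "en", "jurisdiccion", "ubicado",   # interchangeable connectors
}


def normalize(text):
    """Return the significant tokens of a place description."""
    # Lowercase and decompose accented letters (o with accent -> o + mark).
    decomposed = unicodedata.normalize("NFKD", text.lower())
    # Drop the combining marks so "Mulaló" and "Mulalo" compare equal.
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Split on anything that is not a letter or digit (commas, dashes, parens).
    tokens = re.findall(r"[a-z0-9]+", folded)
    return [t for t in tokens if t not in NOISE_WORDS]


variants = [
    "El corregimiento de Mulaló, jurisdicción del municipio de Yumbo (Valle del Cauca)",
    "El corregimiento de Mulaló, en el municipio de Yumbo (Valle del Cauca)",
    "Corregimiento de Mulaló, Municipio de Yumbo, en el Valle del Cauca",
]
# Collect the distinct normalized forms of the three variants.
print({tuple(normalize(v)) for v in variants})
```

The three variants collapse to a single token sequence, while a form without place types ("Mulaló, Yumbo, Valle del Cauca") yields a shorter one. That is why a few pre-generated cases per document should be enough.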
Thanks for the link. The only caveat I see is (of course) figuring out the
right cutoff_frequency. Additionally, there are other very common words in
my index that I wouldn't like to overlook. For instance, a place type such
as "municipio" (municipality) is the second level in the place hierarchy,
so it can appear in any place from the third level down the hierarchy.
The sample data I mentioned above is a third-level place.
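For reference, a request body along these lines could be built as below. This is only a sketch, and it assumes the `common` terms query of the Elasticsearch 1.x query DSL is the approach in question; the field name `place_path`, the helper `common_terms_body`, and the 0.01 cutoff are placeholders I made up.

```python
import json


def common_terms_body(field, text, cutoff_frequency):
    """Build a search body using the `common` terms query."""
    return {
        "query": {
            "common": {
                field: {
                    "query": text,
                    # Values below 1 are relative to the number of docs;
                    # values of 1 or more are absolute document counts.
                    "cutoff_frequency": cutoff_frequency,
                    # Require every low-frequency term to match.
                    "low_freq_operator": "and",
                }
            }
        }
    }


# Hypothetical field name and cutoff value, for illustration only.
body = common_terms_body("place_path", "Mulaló, Municipio de Yumbo", 0.01)
print(json.dumps(body, ensure_ascii=False, indent=2))
```

With a body like this, a term such as "municipio" that lands above the cutoff would by default only contribute to scoring rather than to matching, which is exactly the case I'm worried about.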
2014-08-28 13:55 GMT-05:00 Itamar Syn-Hershko itamar@code972.com:
Elasticsearch Platform — Find real-time answers at scale | Elastic