Hi,
What's the best way to handle an index with multilingual documents (e.g.,
one language per document, but not all documents in the same language)?
Is explicitly specifying the language-specific analyzer via _analyzer
for each document at index time the best approach?
See http://www.elasticsearch.org/guide/reference/mapping/analyzer-field.html
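To make that first option concrete, this is roughly what I have in mind. It is only a sketch: the type name "doc" and the field name "lang_analyzer" are placeholders I made up, the sample titles and dates are invented, and I'm assuming the built-in "english" and "french" analyzers are available in the version in use.

```shell
# Sketch only: "doc" and "lang_analyzer" are placeholder names, not anything standard.
ES=${ES:-localhost:9200}

# _analyzer.path names the document field whose value is the analyzer to use
# for fields that do not set an analyzer explicitly.
MAPPING='{
  "mappings" : {
    "doc" : {
      "_analyzer" : { "path" : "lang_analyzer" },
      "properties" : {
        "lang_analyzer" : { "type" : "string", "index" : "not_analyzed" },
        "title" : { "type" : "string", "index" : "analyzed" },
        "pubDate" : { "type" : "date" }
      }
    }
  }
}'

# Each document carries the name of its own analyzer (sample values are invented).
DOC_EN='{ "lang_analyzer" : "english", "title" : "Hello world", "pubDate" : "2011-06-01" }'
DOC_FR='{ "lang_analyzer" : "french", "title" : "Bonjour le monde", "pubDate" : "2011-06-01" }'

# Create the index with the mapping above.
create_index() { curl -XPOST "$ES/test" -d "$MAPPING"; }

# Index one document: first argument is the id, second is the JSON body.
index_doc() { curl -XPUT "$ES/test/doc/$1" -d "$2"; }
```

Usage would be something like: create_index, then index_doc 1 "$DOC_EN" and index_doc 2 "$DOC_FR".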
Or would a separate mapping type per language work better, each with its own
analyzer on the analyzed fields (assuming the built-in english and french
analyzers):
curl -XPOST localhost:9200/test -d '{
  "mappings" : {
    "english" : {
      "properties" : {
        "title" : { "type" : "string", "index" : "analyzed", "analyzer" : "english" },
        "pubDate" : { "type" : "date" }
      }
    },
    "french" : {
      "properties" : {
        "title" : { "type" : "string", "index" : "analyzed", "analyzer" : "french" },
        "pubDate" : { "type" : "date" }
      }
    }
  }
}'
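With the per-type layout, I imagine indexing and searching would look roughly like this. Again just a sketch: the document ids, sample title, date, and query terms are made up for illustration.

```shell
# Sketch only: sample document and function names are invented for illustration.
ES=${ES:-localhost:9200}

# A French document; the language is implied by the type it is indexed into.
DOC_FR_TYPED='{ "title" : "Bonjour le monde", "pubDate" : "2011-06-01" }'

# Index the document into the "french" type under the given id.
index_french() { curl -XPUT "$ES/test/french/$1" -d "$DOC_FR_TYPED"; }

# Search only the French documents for a term in the title.
search_french() { curl -XGET "$ES/test/french/_search?q=title:$1"; }

# Search both types at once by listing them, comma-separated, in the URL.
search_all() { curl -XGET "$ES/test/english,french/_search?q=title:$1"; }
```

For example: index_french 1, then search_french bonjour or search_all bonjour.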
Or maybe there is some other option I'm missing?
Thanks,
Otis