Multilingual queries in ElasticSearch

Let's say we have the following mapping in ElasticSearch.

{
  "content": {
    "properties": {
      "id": {
        "type": "string",
        "index": "not_analyzed",
        "store": "yes"
      },
      "locale_container": {
        "type": "object",
        "properties": {
          "english": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "english",
                "search_analyzer": "english",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "english",
                "search_analyzer": "english",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          },
          "german": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "german",
                "search_analyzer": "german",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "german",
                "search_analyzer": "german",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          },
          "russian": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "russian",
                "search_analyzer": "russian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "russian",
                "search_analyzer": "russian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          },
          "italian": {
            "type": "object",
            "properties": {
              "title": {
                "type": "string",
                "index_analyzer": "italian",
                "search_analyzer": "italian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              },
              "text": {
                "type": "string",
                "index_analyzer": "italian",
                "search_analyzer": "italian",
                "index": "analyzed",
                "term_vector": "with_positions_offsets",
                "store": "yes"
              }
            }
          }
        }
      }
    }
  }
}

When a particular user queries the index, we can take her culture from
her settings, i.e. we know which analyzer to use. How can we formulate
a query which will search only the "title" and "text" fields in her own
language (let's say, German) and use the German analyzer to tokenize the
search query?

What happens if you search on the content.locale_container.english.text
field or on content.locale_container.german.text?
Something like:

{
  "query" : {
    "term" : { "content.locale_container.english.text" : "my english words" }
  }
}

HTH
David

-----Original Message-----
From: elasticsearch@googlegroups.com [mailto:elasticsearch@googlegroups.com]
On Behalf Of Pavel
Sent: Wednesday, September 14, 2011 05:03
To: elasticsearch
Subject: Multilingual queries in ElasticSearch

A term query (as shown above) would not use an analyzer (I think), so
the query string would have to match an indexed token exactly. You
probably need to go with the text query family:

{
  "query" : {
    "text" : { "content.locale_container.english.text" : "my english words" }
  }
}

Based on your mapping, ES would look up the correct search analyzer for the
field 'content.locale_container.english.text' and use it to analyze your
query 'my english words'. In other words, the correct analyzer is identified
based on the field you query on (via the mapping definition).
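
To cover both "title" and "text" for the user's language, one option is to
build the query body from the locale name. The Python sketch below is only an
illustration: the helper build_locale_query, the SUPPORTED_LOCALES tuple and
the sample German query string are made up for this example, and combining two
text queries in a bool/should clause is an assumption about what "search both
fields" should mean here. The query type is a parameter because later
ElasticSearch releases renamed the text query to match.

```python
import json

# Locales that have their own sub-object in the mapping from this thread.
SUPPORTED_LOCALES = ("english", "german", "russian", "italian")


def build_locale_query(locale, user_query, query_type="text"):
    """Build a query body that searches title and text for one locale.

    Hypothetical helper: each field is queried with an analyzed query,
    so the search analyzer is picked from the mapping of that field.
    """
    if locale not in SUPPORTED_LOCALES:
        raise ValueError("unsupported locale: %r" % (locale,))
    prefix = "content.locale_container.%s." % locale
    # One analyzed query per field; a document matches if either field matches.
    clauses = [
        {query_type: {prefix + name: user_query}}
        for name in ("title", "text")
    ]
    return {"query": {"bool": {"should": clauses}}}


if __name__ == "__main__":
    # Print the JSON body that would be sent to the _search endpoint.
    body = build_locale_query("german", "meine deutschen Wörter")
    print(json.dumps(body, indent=2, ensure_ascii=False))
```

Because only the German field names appear in the body, the English, Russian
and Italian fields are never touched, and no analyzer has to be named in the
query itself.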