Alternative approaches to query - Part II


(James Cook) #1

Suppose I am indexing a collection of movie subtitle translations, where
each document is a translation with properties identifying its language and
the movie it belongs to. Some subtitles have a locale such as "fr_CA" or
"fr_FR" to distinguish Canadian French from European French. A search term
of 'fr' needs to match both 'fr_CA' and 'fr_FR'.

My use case is a top-five list by locale: for example, the five most
recently added films with French subtitles.

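A not-so-nice approach is to run a query that fetches more subtitles than
needed and then aggregate the result set on the film id property in the
client. This isn't very good on several fronts: it transfers documents that
are then thrown away, and there is no guarantee the fetched page contains
five distinct films.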

curl -XPOST 'http://localhost:9200/movies/subtitles/_search' -d '
{
  "from" : 0,
  "size" : 10,
  "query" : {
    "term" : {
      "locale" : "fr"
    }
  }
}'
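
To make the client-side step concrete, here is a minimal sketch of the aggregation I mean, written in Python. The field names "film_id", "locale" and "added" are made up for illustration (the post only says each subtitle has a film id and a locale), and the function name top_films is hypothetical. It assumes the hits have already been pulled out of the search response as plain dictionaries and that "added" is an ISO-8601 date string.

```python
from typing import Dict, Iterable, List


def top_films(hits: Iterable[Dict[str, str]], language: str, limit: int = 5) -> List[str]:
    """Return up to `limit` distinct film ids for a language, newest first.

    Field names ("film_id", "locale", "added") are assumptions for this sketch.
    """
    # Accept the bare language code ("fr") and regional variants ("fr_CA").
    matching = [
        hit for hit in hits
        if hit["locale"] == language or hit["locale"].startswith(language + "_")
    ]
    # ISO-8601 date strings sort chronologically when compared as text.
    matching.sort(key=lambda hit: hit["added"], reverse=True)

    seen = set()
    films: List[str] = []
    for hit in matching:
        film_id = hit["film_id"]
        if film_id in seen:
            continue  # a newer subtitle for this film was already counted
        seen.add(film_id)
        films.append(film_id)
        if len(films) == limit:
            break
    return films
```

If the fetched page happens to contain many subtitles for the same few films, this returns fewer than five ids and the query has to be repeated with a larger "size", which is exactly why I would prefer to do the grouping on the server.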

Creating a movies index in which each movie document holds a child
collection of subtitles is another approach, but assume the data is indexed
by subtitle document alone.

Are there other techniques, or a projection capability, that would help with
these kinds of aggregated queries?


(system) #2