Global direction: Parent-Child: Grouping and Sorting

Hi all,

I have a question about grouping and sorting. I can provide complete gist
examples, but I would first like to check the overall perspective: am I on
the right track, am I missing something, and what is the planned direction
for this?

My use case: I have a lot of press articles, some of which are very similar
in content. I have to provide a search interface that groups these
near-duplicates while giving the user many ways to search on article
content and metadata.

I decided to create a parent-child mapping (parent: group, child: article)
for the following reasons:

  • the grouping will change over time: new articles are added constantly,
    and I do not want to reindex a lot of stuff
  • articles have their own visibility restrictions, but should be indexed up
    front
  • I strive for simple pagination and do not want to collect groups without
    knowing how many child documents I have to fetch

My current search strategy has two stages:
(1) search with a has_child query for groups
(2) resolve all children for the groups with a has_parent query
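
To make the two stages concrete, here is a minimal sketch of the two request bodies, built as plain Python dicts. The helper names, the paging parameters and the "title" field in the usage note are assumptions for illustration only; the type names "group" and "article" come from the mapping described above.

```python
from typing import Any, Dict, List


def group_search_body(article_query: Dict[str, Any],
                      size: int = 10, offset: int = 0) -> Dict[str, Any]:
    """Stage 1: find groups that have at least one matching article."""
    return {
        "from": offset,
        "size": size,
        "query": {
            "has_child": {
                "child_type": "article",
                "query": article_query,
            }
        },
    }


def children_search_body(group_ids: List[str], size: int = 100) -> Dict[str, Any]:
    """Stage 2: fetch the articles whose parent is one of the given groups."""
    return {
        "size": size,
        "query": {
            "has_parent": {
                "parent_type": "group",
                # ids query restricted to the groups returned by stage 1
                "query": {"ids": {"values": list(group_ids)}},
            }
        },
    }
```

For example, group_search_body({"match": {"title": "election"}}) would be sent to the group type, and the ids of the returned hits would then be passed to children_search_body and sent to the article type.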

The problem is that I need to sort the parents/groups (the result of the
first has_child query) by values of the children (articles). As far as I
understand, this is currently not possible.

The only workaround I have found is to wrap the child query inside has_child
in a function_score query and sort on the resulting score. Something like
this (match_all stands in for the actual article query):

curl -XGET 'http://localhost:9200/index/group/_search?pretty=1' -d '{
  "query" : {
    "has_child" : {
      "query" : {
        "function_score" : {
          "query" : {
            "match_all" : { }
          },
          "functions" : [ {
            "script_score" : {
              "script" : "doc[\"article.publicationNameSort\"].value"
            }
          } ],
          "boost_mode" : "replace"
        }
      },
      "child_type" : "article",
      "score_type" : "max"
    }
  },
  "sort" : [ {
    "_score" : { }
  } ]
}'

The problem in my use case is that the sort often needs more than one
field, or even several string values. Compressing these into a single
double is not always possible.
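
To illustrate the limitation, here is a small sketch of packing two non-negative integer sort keys into one number. The function name and the ranges are hypothetical; the only hard fact used is that a double represents integers exactly only below 2**53, so a few bounded numeric keys fit, while arbitrary strings do not.

```python
MAX_EXACT = 2 ** 53  # integers below this bound are exactly representable as a double


def pack_keys(primary: int, secondary: int, secondary_range: int) -> float:
    """Combine two non-negative integer keys so that numeric order of the
    result equals lexicographic order of (primary, secondary)."""
    if primary < 0 or not 0 <= secondary < secondary_range:
        raise ValueError("keys out of range")
    packed = primary * secondary_range + secondary
    if packed >= MAX_EXACT:
        # beyond this point neighbouring values collapse into the same double
        raise OverflowError("combined key does not fit into a double exactly")
    return float(packed)
```

With a secondary range of 100, the pair (1, 99) sorts before (2, 0), as required. Once the combined value reaches 2**53 the ordering can no longer be trusted, which is why several string values cannot be squeezed in this way.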

My questions:

(A) Will there be sorting support for has_child queries in the (near)
future?

There are different comments on this in the community. Is this easy (since
it is supported by Lucene) or very high-hanging fruit?

(B) Is there another way to achieve the grouping?

The grouping could be done by hand: fetch child values with a simple query,
scan the results, gather some kind of 'parent/group' field, and return the
result once enough groups have been resolved. That is a nightmare for
pagination, and it looks a lot like the problems Elasticsearch has already
solved in parent-child queries / the top_children query.
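
A rough sketch of this by-hand grouping, to show where the pagination trouble comes from. It assumes the child hits arrive already sorted and that each hit carries a hypothetical "group_id" field; neither the function name nor the field name comes from Elasticsearch.

```python
from collections import OrderedDict
from typing import Any, Dict, Iterable, List, Tuple

Hit = Dict[str, Any]


def page_of_groups(hits: Iterable[Hit], offset: int,
                   size: int) -> List[Tuple[str, List[Hit]]]:
    """Scan sorted child hits and return groups offset..offset+size-1,
    in the order in which each group was first seen."""
    groups: "OrderedDict[str, List[Hit]]" = OrderedDict()
    for hit in hits:
        gid = hit["group_id"]
        if gid not in groups:
            if len(groups) == offset + size:
                break  # one group more than needed: stop scanning
            groups[gid] = []
        groups[gid].append(hit)
    return list(groups.items())[offset:offset + size]
```

The scan has to start from the first hit for every page, and it is not known in advance how many child hits must be read. Children that sort after the stopping point are also missing from their groups, so a second lookup per group is still needed.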

All other comments and suggestions are much appreciated.

Best regards, Wolfgang
