Finally, I tried with a new index mapping.
{
"mappings": {
"properties": {
"id": {
"type": "long"
},
"metadata": {
"properties": {
"film_group_id": {
"type": "long"
},
"production": {
"type": "text"
},
"duration": {
"type": "long"
},
"created": {
"type": "date"
}
}
},
"film_group": {
"type": "join",
"relations": {
"group": "film"
}
}
}
}
}
So when inserting the a document with a new film_group_id, I first insert a parent with this new film_group_id like so:
{
"id": 1,
"metadata": {
"film_group_id": 1
//other meta fields
},
"film_group": {
"name": "group"
}
}
And when inserting a new doc with existing film_group_id:
{
"id": 7,
"metadata": null,
"film_group": {
"name": "film",
"parent": 1 //parent id which is also film_group_id
}
}
Notice the metadata null value as the metadata is set on the parent insertion
Finally the query with the sort and the pagination:
{
"sort": {
"metadata.duration": "asc" //sort on a meta field
},
"from": 0, //pagination
"size": 1, //pagination
"query": {
"bool": {
"must": [
{
"has_child": {
"type": "film",
"query": {
"match_all" : {}
}
}
}
]
}
}
}
As only the parent documents have the metadata, I only need to look for the parent documents without any aggregation. I don't know how if the performance will follow along but for now, this solve my needs.
Anyway, thanks a lot for your help 