I have a data model with a parent-child relationship.
Parent type subject mapping:
{
  "properties": {
    "MRN": {
      "type": "keyword"
    },
    "display_name": {
      "type": "keyword"
    },
    "protocols": {
      "type": "keyword"
    },
    "gender": {
      "type": "keyword"
    },
    "dob": {
      "type": "date"
    }
    ....
}
Child type lab mapping:
{
  "_parent": {
    "type": "subject",
    "eager_global_ordinals": false
  },
  "properties": {
    "GUID": {
      "type": "keyword"
    },
    "blinded": {
      "type": "boolean"
    },
    "collect_time": {
      "type": "date"
    },
    .....
}
Each lab document has only 1 parent, while a subject can have many lab documents.
I want to retrieve both the parent and the child documents, so that the subject's gender, dob, etc. are returned with each search result.
Sample query:
{
  "bool": {
    "must": [
      {
        "has_parent": {
          "parent_type": "subject",
          "inner_hits": {
            "_source": {
              "includes": ["MRN", "display_name", "gender", "race", "ethnic_group", "dob"]
            }
          },
          "query": {
            "bool": {
              "must": [
                {
                  "terms": {
                    "protocols": ["18785-BTRIS-TEST-01"]
                  }
                }
              ]
            }
          }
        }
      },
      {
        "term": {
          "_type": "lab"
        }
      },
      {
        "query_string": {
          "default_field": "observation.name",
          "query": "glucose"
        }
      }
    ]
  }
}
This query returns the expected results, but it is quite slow: it takes about 5 seconds in my test environment with size 1000. If I remove the "inner_hits" part, the same query runs much faster (about 100 milliseconds).
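For anyone who wants to reproduce the comparison, here is a minimal Python sketch that builds the two query bodies (with and without "inner_hits"). The helper name build_query and its parameters are my own illustrative naming, not part of any Elasticsearch client, and the sketch only constructs the JSON; it does not send it to a cluster.

```python
import json

# Parent fields requested through inner_hits in the sample query.
PARENT_FIELDS = ["MRN", "display_name", "gender", "race", "ethnic_group", "dob"]


def build_query(protocols, text, with_inner_hits=True):
    """Build the bool query from the post (hypothetical helper).

    with_inner_hits=False produces the fast variant that returns
    only the lab (child) documents.
    """
    has_parent = {
        "parent_type": "subject",
        "query": {
            "bool": {"must": [{"terms": {"protocols": list(protocols)}}]}
        },
    }
    if with_inner_hits:
        # This is the part that makes the query slow in my environment.
        has_parent["inner_hits"] = {
            "_source": {"includes": list(PARENT_FIELDS)}
        }
    return {
        "bool": {
            "must": [
                {"has_parent": has_parent},
                {"term": {"_type": "lab"}},
                {
                    "query_string": {
                        "default_field": "observation.name",
                        "query": text,
                    }
                },
            ]
        }
    }


slow = build_query(["18785-BTRIS-TEST-01"], "glucose", with_inner_hits=True)
fast = build_query(["18785-BTRIS-TEST-01"], "glucose", with_inner_hits=False)
print(json.dumps(fast, indent=2))
```

The two bodies differ only in the "inner_hits" key, so any timing difference between them should come from fetching the parent documents rather than from matching.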
My question: how can I retrieve both the parent and child documents more efficiently?