I have a query that is very fast (sub-second) without inner_hits, but takes 20-30 seconds when inner_hits are returned. Here is the query:
{
  "from": 0,
  "size": 2500,
  "terminate_after": 2500,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {"range": {"timestamp_utc": {
              "gte": "2018-03-16 00:00:00",
              "lt": "2018-03-23 00:00:00",
              "format": "yyyy-MM-dd HH:mm:ss"}}},
            {"nested": {
              "path": "analytics",
              "query": {"terms": {"analytics.rp_entity_id": ["D8442A","4A6F00","FD9CFE","13FF12","B811D5","3D4567","228D42","C4EEAD","2D160B","DD3BB1","12E454","713810","352A3A","ECD263","7373D4","251988","E09E2B","C12ED9","0157B1","9768FE","D90F43","A4090F","69ADD9","42470E","4F9926","619882","7E3AFB","E5754F","598511","A18D3C","FF8CFC","9FEBFF","E5FA3A","90F0CE","CEC128","D6AAF0","1BC12C","1BC945","A6213D","267718","2F40E5","FE89E0","508CFD","AD9C5F","A5DD79","D71D85","ECDC73","0BB903","340280","A21964"]}}
              /**/,
              "inner_hits": {
                "name": "analytics_hits_1",
                "size": 1,
                "_source": false
              }
              /**/
            }}
          ]
        }
      }
    }
  },
  "_source": false,
  "sort": [{"timestamp_utc": {"order": "desc"}}]
}
As you can see, I have already disabled _source and set the inner_hits size to 1.
I'm running ES 6.3.2. I've already searched the web and checked the forums. I also switched the compression codec from best_compression to default, which gave me about a 20% boost in query performance, but things are still really slow.
I have looked at the profiler output, and it only tells me that the query itself is fast (which it is, if I remove inner_hits). It doesn't seem to report anything about the work required by inner_hits.
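Since the profiler doesn't isolate the inner_hits cost, the only comparison I can make is end-to-end timing of the two variants. For anyone who wants to reproduce that, here is a minimal Python sketch of the idea. The helper names (strip_inner_hits, time_variants) and the search callable are made up for illustration and are not part of any Elasticsearch client; search is assumed to be whatever function sends a request body to the cluster.

```python
import copy
import time


def strip_inner_hits(node):
    """Return a copy of a query body with every "inner_hits" key removed."""
    if isinstance(node, dict):
        return {k: strip_inner_hits(v) for k, v in node.items() if k != "inner_hits"}
    if isinstance(node, list):
        return [strip_inner_hits(v) for v in node]
    return copy.deepcopy(node)


def time_variants(search, body, runs=3):
    """Time search(body) with and without inner_hits.

    `search` is a hypothetical callable that sends the body to the cluster,
    e.g. a thin wrapper around an HTTP POST to the index's _search endpoint.
    Returns the best wall-clock time in seconds for each variant.
    """
    variants = {
        "with_inner_hits": body,
        "without_inner_hits": strip_inner_hits(body),
    }
    timings = {}
    for name, variant in variants.items():
        best = float("inf")
        for _ in range(runs):
            start = time.perf_counter()
            search(variant)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings
```

Taking the best of several runs is only meant to reduce noise from caching and warm-up; it measures client-side wall-clock time, so network and serialization overhead are included.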
I know it will be hard to give specifics, since I can't provide a reproducible test case (it would probably require a lot of data in an index). But where should I look, and what parameters should I consider tweaking? Should I expect that including inner_hits will make the query 20x slower?
Any hints, tips, and suggestions will be appreciated.
-Jason