Help understanding inner_hits performance

I have a parent/child relationship that queries very quickly, but performance falls over as soon as I retrieve inner_hits. Since I'm not returning the source, and Elasticsearch has already done this computation to determine whether there's a hit, why does it matter?

POST browse/_search?request_cache=false
{
  "_source": {
    "excludes": [],
    "includes": [
      "resolvedName",
      "ll",
      "genre"
    ]
  },
  "from": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "geo_bounding_box": {
            "ll": {
              "bottom_right": [
                -73.9544677734375,
                40.70979201243495
              ],
              "top_left": [
                -73.9599609375,
                40.71395582628603
              ]
            }
          }
        },
        {
          "type": {
            "value": "venues"
          }
        }
      ],
      "should": {
        "has_child": {
          "inner_hits": {
            "_source": false,
            "size": 1
          },
          "query": {
            "bool": {
              "filter": [
                {
                  "term": {
                    "published": true
                  }
                }
              ]
            }
          },
          "type": "events"
        }
      }
    }
  },
  "size": 1500
}

This takes 200-300ms on my ES instance. Dropping the inner_hits clause (but keeping the has_child clause) reduces the query time to about 13ms(!)

In the query above I'm trying to pull back all venues in an area and annotate whether each one has an "event" at it. I can hack around this by sniffing matched_queries, but that's a bit grim.
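
For reference, the matched_queries workaround could look roughly like the sketch below. It assumes the has_child clause is given a `_name` in place of inner_hits, and that each hit in the response then lists that name under `matched_queries`. The label `has_published_event`, the helper names, and the sample response are all made up for illustration; the code only builds and reads plain dicts, with no client library involved.

```python
# Hypothetical label for the named query; any string would do.
HAS_EVENT = "has_published_event"


def has_child_clause():
    """The has_child clause from the request above, with inner_hits swapped for a _name."""
    return {
        "has_child": {
            "_name": HAS_EVENT,
            "type": "events",
            "query": {"bool": {"filter": [{"term": {"published": True}}]}},
        }
    }


def annotate_venues(response):
    """Return (venue_id, has_event) pairs from a search response dict."""
    return [
        (hit["_id"], HAS_EVENT in hit.get("matched_queries", []))
        for hit in response["hits"]["hits"]
    ]


# Trimmed, made-up response used only to show the shape of the data.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "venue-1", "matched_queries": [HAS_EVENT]},
            {"_id": "venue-2"},  # no matched_queries: the should clause did not match
        ]
    }
}

print(annotate_venues(sample_response))
```

The clause returned by `has_child_clause()` would go where the current `should` clause sits; whether it stays as cheap as the 13ms query without inner_hits is something I'd still need to measure.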

Thanks

Inner hits are indeed slow. To check which children matched, Elasticsearch needs to run the query again on specific documents.

It is true that Elasticsearch already computed this information, but there could be many matches, and keeping track of this information for all of them would require a lot of memory.