Has_child query performance


We just rolled out a new portion of our product that allows searching by child documents. The response times on these queries are significantly higher (i.e. slower) than when querying only fields on the parent document.

I was wondering what are the quick win performance improvements we could make to our query to speed up these response times (sometimes as high as 2000-3000ms).

I've read about Global Ordinals (which we don't currently have enabled via our mapping) https://www.elastic.co/guide/en/elasticsearch/reference/2.3/mapping-parent-field.html#_global_ordinals.

What would the expected performance improvement be for that type of change? What else could we consider?

Here's an example of the query:

Enabling eager global ordinals loading should fix the spikes you're seeing. It ensures that global ordinals are built during the refresh, instead of during the first search request that finds them missing.
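For reference, on 2.x this is configured via the `fielddata.loading` option on the `_parent` field in the child type's mapping, per the docs you linked. A sketch (the index and type names below are placeholders, not your actual mapping):

```json
PUT /my_index
{
  "mappings": {
    "my_parent": {},
    "my_child": {
      "_parent": {
        "type": "my_parent",
        "fielddata": {
          "loading": "eager_global_ordinals"
        }
      }
    }
  }
}
```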

If you enable eager global ordinals loading, you should also consider increasing the refresh interval so that global ordinals are rebuilt less frequently. The default refresh interval is 1 second; if your requirements allow, you can set it to 60 seconds or so.
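The refresh interval is a dynamic index setting, so it can be changed on a live index (index name below is a placeholder):

```json
PUT /my_index/_settings
{
  "index": {
    "refresh_interval": "60s"
  }
}
```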

@mvg based on your logic, after the first query the global ordinals should be built, so all queries involving the same documents (e.g. executing the exact same query twice) should be significantly faster the second time.

However, despite running this particular query multiple times, we're still seeing very slow response times. I'm hesitant to implement this change and reindex the entire cluster, only to find out it wasn't the root cause.

What level of performance gain should be expected? Are there any other things I should consider? What about disabling scoring entirely in that query? Does the child element need to have an id field and could that be causing performance problems?
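On the scoring question: one option is to run `has_child` in a filter context so no scores are computed at all (in 2.x, `score_mode` already defaults to `none`). A sketch with placeholder index, type, and field names:

```json
POST /my_index/_search
{
  "query": {
    "bool": {
      "filter": {
        "has_child": {
          "type": "my_child",
          "query": {
            "term": { "some_field": "some_value" }
          }
        }
      }
    }
  }
}
```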

Our mapping for the child element is as follows:

I don't see that you've enabled eager global ordinals loading in your mapping. This setting can be changed without a reindex; in fact, it can be changed at runtime without any reload.
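A hedged sketch of that runtime change (placeholder index and type names; on 2.x, fielddata loading settings can generally be updated on a live index via the put-mapping API, though it's worth verifying this for the `_parent` field on your version):

```json
PUT /my_index/_mapping/my_child
{
  "_parent": {
    "fielddata": {
      "loading": "eager_global_ordinals"
    }
  }
}
```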

I would expect the spikes (the 3000 ms you were reporting) to be gone once eager global ordinals loading is configured on the `_parent` field in your mapping.

Yes, if there are no changes to the index between the two requests. By default, every second ES checks whether there are changes and, if so, performs a refresh (making the latest changes visible for search). If a refresh happens, the global ordinals are invalidated and have to be rebuilt: either lazily when requested (for example, by a has_child query), or eagerly as part of the refresh, which has the advantage that searches won't take a significant performance hit.
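If you want to measure that rebuild cost directly, you can trigger a refresh by hand and time it (index name is a placeholder):

```json
POST /my_index/_refresh
```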

If we require refresh intervals in the 1-second range, and the number of documents in this index is ~130 million, will enabling eager global ordinals cause performance issues on refresh?

Users expect changes to be searchable immediately (1s is our definition of "immediate").

Thank you very much for your advice, this has been extremely helpful!

Yes, it will make refreshes slower but searches faster. I think the near-realtime delay would be at least the same with eager global ordinals, and likely worse, compared to lazy global ordinals. (Whether the cost is spent at refresh time or at search time is irrelevant to most users.) Unless you completely denormalize your documents, a 1-second near-realtime delay isn't achievable with parent/child at scale.

Eager global ordinals loading is a form of warming that is executed as part of the refresh. A shard (a Lucene index) consists of segments: changes are written into new segments, and existing segments may be merged into a new, larger segment (the previous segments are eventually removed). Changes (the new segments) aren't immediately visible via the search API; the refresh periodically updates the view (the currently visible segments) that the search API uses, independently on each shard, and eager global ordinals loading is triggered at this level.

In my opinion, eager global ordinals loading is better than lazy loading when search requests with has_child or has_parent queries are executed frequently. If a Lucene index changes every second, then multiple search requests executed in parallel are each likely to have their own view of the Lucene index, which causes global ordinals to be loaded more often than once per refresh and can increase the load on the ES side significantly. If searches are less frequent, it makes sense to keep the default (lazy global ordinals loading).

You can also consider (if your data allows it) splitting your data across more indices based on certain properties (for example, user), or using routing (at both index and search time) with your current index strategy. The main goal is to make it less likely that changes affect all shards/indices.
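A sketch of search-time routing (the `routing` parameter is real; the `user123` value is a placeholder, and with parent/child, parents and their children must share the same routing value so they land on the same shard):

```json
POST /my_index/_search?routing=user123
{
  "query": {
    "has_child": {
      "type": "my_child",
      "query": { "match_all": {} }
    }
  }
}
```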

Adding a few more shards generally helps as well, since the per-shard global ordinals are smaller and can be built more concurrently. However, don't create too many shards, as that can cause other problems. If more than a few shards need to be added, also add more nodes.

@mvg I think we would be OK with slower refreshes. I'm most concerned about experimenting with these values and causing the cluster to crash (trying to anticipate the worst-case scenario).

Query throughput is definitely significantly lower than indexing throughput: we have a large set of data that is changing all the time, while query throughput is a function of our currently active users.

With eager global ordinals enabled, do the ordinals have to be rebuilt for the entire index (or shard?) on each refresh, or only for the data that changed? If you can point me to any documentation on how global ordinals work under the hood, I might be able to have a slightly more fruitful conversation :slight_smile:

1: Preloading Fielddata | Elasticsearch: The Definitive Guide [2.x] | Elastic
2: Practical Considerations | Elasticsearch: The Definitive Guide [2.x] | Elastic

Global ordinals are rebuilt completely for the entire shard after each refresh.

@mvg thanks - this convo has been super helpful. How many shards would you recommend for an index with 140mil documents? Right now we have 6. Do you think we should increase that and reindex everything? Would that help keep the refreshes w/ eager global ordinal less expensive?

How many data nodes do you currently have in your cluster, and what is the shard size in terms of disk usage and number of documents?

Before making such a change, maybe check how long it currently takes to load global ordinals. You can see this in the logs if you set the index.warmer logger to trace.

Note that you don't have to restart nodes to change the log level: https://www.elastic.co/guide/en/elasticsearch/guide/2.x/logging.html
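Per that guide, log levels are dynamic cluster settings, so the change can be made on a running cluster, along the lines of:

```json
PUT /_cluster/settings
{
  "transient": {
    "logger.index.warmer": "TRACE"
  }
}
```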

We currently have 140mil docs, 6 shards with total disk usage of 92gb (across the 6 shards). 15.3gb per shard.

2 data nodes, so total size across both nodes is ~180gb.

@mvg it currently takes ~15 seconds to do a refresh with eager loading of global ordinals. Would you expect this time to decrease if we were to increase the number of shards to 9 or 12?

@mvg How can I change this setting without reindex/reload?