Terms Query performance with increased number of terms

I would like to understand how the performance of a terms query is affected by the number of terms passed to it. According to some very old blog posts and Q&A threads, around 1,000 terms should not cause issues, but what if the number of terms is much larger?

Our current Elasticsearch version is 2.4, and I'm using the terms lookup mechanism to filter results. When the number of terms is around 5,000, I see a significant change in query performance: the query starts to take around 800 ms. Is this expected behaviour?
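For context, a terms lookup request of the kind described above has roughly the following shape. This is only a minimal sketch: the helper `build_terms_lookup_query` and the names `user_id`, `users`, `user`, `2` and `followers` are hypothetical placeholders, not taken from the actual setup. The `type` key is included only when a document type is given, since 2.x lookups name a type while 7.x lookups normally omit it.

```python
import json
from typing import Optional


def build_terms_lookup_query(
    field: str,
    index: str,
    doc_id: str,
    path: str,
    doc_type: Optional[str] = None,
) -> dict:
    """Build a request body whose filter terms are fetched from another document.

    All argument values are illustrative; substitute your own index and field names.
    """
    lookup = {"index": index, "id": doc_id, "path": path}
    if doc_type is not None:
        # Older clusters (such as 2.x) address the lookup document by type as well.
        lookup["type"] = doc_type
    # Filter context: the terms only include or exclude documents, no scoring.
    return {"query": {"bool": {"filter": [{"terms": {field: lookup}}]}}}


if __name__ == "__main__":
    body = build_terms_lookup_query("user_id", "users", "2", "followers", doc_type="user")
    print(json.dumps(body, indent=2))
```

Every value stored under `path` in the lookup document becomes one term in the filter, so a lookup document holding 5,000 values produces a 5,000-term query.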

Are there any measures I can take to improve the query performance?

Since we'll be migrating to 7.x, should I expect better performance for the same lookup query? I'm asking because, according to the latest documentation quoted below, the default limit on the number of terms is 65,536.

By default, Elasticsearch limits the terms query to a maximum of 65,536 terms. This includes terms fetched using terms lookup.

So should I expect that a large number of terms will not slow down the query in the latest version of Elasticsearch?
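In 7.x the limit quoted above is governed by the `index.max_terms_count` index setting. If a term list might exceed it, one client-side option is to split the list into batches and issue one query per batch. The sketch below shows only the splitting step; `split_terms` is a hypothetical helper, and merging the per-batch results is left to the caller. It does nothing to reduce the per-term cost of each query.

```python
from typing import Iterator, List, Sequence

# Default terms limit quoted from the documentation above.
DEFAULT_MAX_TERMS = 65536


def split_terms(terms: Sequence[str], max_terms: int = DEFAULT_MAX_TERMS) -> Iterator[List[str]]:
    """Yield consecutive slices of `terms`, each at most `max_terms` long."""
    if max_terms < 1:
        raise ValueError("max_terms must be at least 1")
    for start in range(0, len(terms), max_terms):
        yield list(terms[start:start + max_terms])


if __name__ == "__main__":
    all_terms = [str(i) for i in range(70000)]
    for batch in split_terms(all_terms):
        print(len(batch))
```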

I would say that it is expected. The more terms you pass in, the more lookups and work Elasticsearch has to perform.

I would still expect an increased number of terms to slow down the query. Whether performance for this type of query has improved, I will need to leave to someone more familiar with the internals.

Probably, but I would spin up two parallel test clusters, ingest the same amount of data into both and simply test it.
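A side-by-side test like this could be driven by a small timing harness along the following lines. It is a rough sketch under stated assumptions: each cluster is represented by a caller-supplied function that sends a request body and waits for the response (for example, a thin wrapper around whatever client you use), and the names `compare_clusters`, `time_query` and `build_terms_query` are hypothetical. The generated terms are synthetic, so for realistic numbers you would substitute values that actually occur in your index.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def build_terms_query(field: str, terms: Sequence[str]) -> dict:
    """Request body with an explicit list of terms in filter context."""
    return {"query": {"bool": {"filter": [{"terms": {field: list(terms)}}]}}}


def time_query(run_search: Callable[[dict], object], body: dict, repeats: int = 5) -> float:
    """Return the median wall-clock time in milliseconds over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_search(body)  # assumed to block until the response has arrived
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def compare_clusters(
    clusters: Dict[str, Callable[[dict], object]],
    field: str,
    term_counts: Sequence[int],
    repeats: int = 5,
) -> Dict[str, Dict[int, float]]:
    """Time the same terms query at several sizes against each cluster."""
    results: Dict[str, Dict[int, float]] = {}
    for name, run_search in clusters.items():
        results[name] = {}
        for count in term_counts:
            # Synthetic terms; replace with real values from your data.
            terms = [f"term-{i}" for i in range(count)]
            body = build_terms_query(field, terms)
            results[name][count] = time_query(run_search, body, repeats)
    return results


if __name__ == "__main__":
    # Stand-in search functions so the sketch runs without a cluster.
    fake_clusters = {"es-2.4": lambda body: None, "es-7.x": lambda body: None}
    print(compare_clusters(fake_clusters, "user_id", [1000, 5000, 20000]))
```

Using the median of several runs reduces the influence of one-off effects such as cold caches; it is also worth comparing the client-side timings with the `took` value reported in each search response.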

Thanks for the response.

Yes, that is definitely an option, but there might be a reason that Elasticsearch chose 65,536 as the default limit on the number of terms. I'm curious to know the reason for this.
