Aggregations slow after inserts/updates

JF2018 · July 19, 2019, 8:55am

Hi

I have a query, where part of it is a terms-aggregation.
The query normally takes <500ms, but after inserts/updates to the index it takes
roughly 2000-3000ms the first couple of times.

If I remove the aggregation part of the query, I do not see the same performance problems after inserts/updates.
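
For readers following along, a request of the shape described above might look like the sketch below. The field names `title` and `category` and the search text are hypothetical placeholders, not taken from the actual index. It only illustrates that a scored query and a terms aggregation travel in the same request body.

```python
import json


def build_search_body(text, agg_field, size=10):
    """Build a search body with a scored match query plus a terms aggregation.

    'title' is a hypothetical field name used only for illustration.
    """
    return {
        # Scored part of the request: relevance ranking is preserved.
        "query": {"match": {"title": text}},
        # Terms aggregation: bucket counts for the most frequent field values.
        "aggs": {
            "top_values": {
                "terms": {"field": agg_field, "size": size}
            }
        },
    }


body = build_search_body("example search", "category")
print(json.dumps(body, indent=2))
```

Dropping the `aggs` key from this body corresponds to the "without the aggregation" case mentioned above.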

I am running Elasticsearch 1.7, but I can see people experiencing the same problem on other versions:

The solution used in one of the threads, using filter-aggregation, is not possible for me, since the scoring is key for the query.

The threads I can find are from 2016 and 2017, so maybe someone has experienced it in the meantime, and found an explanation/solution?

dadoonet · July 19, 2019, 9:18am

What kind of hardware do you have? Are you using doc_values?
Can you upgrade? We are now on 7.2 and so many things happened in the last 3 or 4 years... Including in the JVM itself.

JF2018 · July 19, 2019, 9:33am

Sadly I cannot upgrade at the current time, even though it is one of my biggest wishes.
We are using doc_values, yes. As far as I understand, aggregation is not possible without doc_values?

We are running on instances with 4 virtual cores, 16GB ram with 7GB allocated to the JVM and 2TB discs

dadoonet · July 19, 2019, 9:35am

If you are not using doc_values (on disk) I think it's using fielddata then (in memory).
Are you using spinning disks?

JF2018 · July 19, 2019, 10:04am

Okay, in the newest versions (7.2) I think doc_values are necessary

All fields which support doc values have them enabled by default. If you are sure that you don’t need to sort or aggregate on a field, or access the field value from a script, you can disable doc values in order to save disk space:

But I can't find the equivalent for version 1.7. I guess doc_values are the intended way to do it, though.

It is SSD, but over the network, so a bit slower than local disks would be.

dadoonet · July 19, 2019, 10:20am

That's probably why it's slow. You should use local disks instead.

JF2018 · July 19, 2019, 10:38am

So the reason it is slow after data updates would be that a cache is flushed and has to be refilled from disk, which is slow because it's over the network?

dadoonet · July 19, 2019, 11:11am

When you update/add data, you are writing new segments to disk. Also, if segment merges need to happen, more data then has to be read from disk.
This means that a new search needs to read the new data from disk again.

JF2018 · July 19, 2019, 11:15am

Makes good sense, I will try to change to a setup with local disks and test if that is enough to solve the problem.
Thank you very much for the help!

dadoonet · July 19, 2019, 11:56am

And it's important that you think of upgrading.

JF2018 · July 24, 2019, 12:06pm

For future reference:

My problem was not that the disk wasn't local.
I tried to change to local disks, without any improvements.

The problem, however, seemed to be that the field I aggregate on has a very high cardinality (>1 million unique values).
Thus the global ordinals were taking a long time to recompute after data changes.
By default, global ordinals are loaded lazily, that is, on the first search after changes.
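
To make the cost concrete, here is a toy model of the idea in Python. It is only a sketch and not how Lucene or Elasticsearch actually implement it: each segment keeps its own sorted list of unique terms, and global ordinals are a mapping from every segment-local position to a position in one index-wide sorted term list. The function name and the sample terms are made up for illustration.

```python
def build_global_ordinals(segments):
    """Map each segment's local term ordinals to index-wide (global) ordinals.

    segments: list of sorted lists of unique terms, one list per segment.
    Returns (global_terms, mappings), where mappings[i][local_ord] is the
    global ordinal of that term in segment i.
    """
    # The work here grows with the number of unique terms across all segments,
    # which is why a high-cardinality field makes the rebuild expensive.
    global_terms = sorted(set(term for seg in segments for term in seg))
    position = {term: ordinal for ordinal, term in enumerate(global_terms)}
    mappings = [[position[term] for term in seg] for seg in segments]
    return global_terms, mappings


segments = [["apple", "cherry"], ["banana", "cherry"]]
terms, maps = build_global_ordinals(segments)

# New data arrives as a new segment, so the toy mapping is rebuilt from scratch.
segments.append(["avocado", "banana"])
terms_after, maps_after = build_global_ordinals(segments)
```

In this toy version, adding one small segment shifts the global positions of existing terms, so the previous mapping cannot simply be reused.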

By changing it to be eager-loaded, I pay at insert/refresh time instead of at search time.
In my case this is a fine solution, since there are no requirements on how long inserts/updates may take.
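
A sketch of what such a mapping change might look like, written as Python dicts that would be sent as the body of a put-mapping request. The field name `category` and the helper function are hypothetical. The `fielddata.loading` form is the 1.x/2.x style as best understood here, and `eager_global_ordinals: true` is the mapping parameter from 5.0 onwards. Please check both against the docs for your exact version before relying on them.

```python
def eager_ordinals_mapping(field, legacy=True):
    """Return a mapping body that asks for eagerly built global ordinals.

    legacy=True  -> assumed 1.x/2.x style ("fielddata.loading")
    legacy=False -> 5.0+ style ("eager_global_ordinals" on a keyword field)
    """
    if legacy:
        field_mapping = {
            "type": "string",
            "index": "not_analyzed",
            # Assumption: this is the 1.x/2.x way to request eager loading.
            "fielddata": {"loading": "eager_global_ordinals"},
        }
    else:
        field_mapping = {
            "type": "keyword",
            "eager_global_ordinals": True,
        }
    return {"properties": {field: field_mapping}}


old_style = eager_ordinals_mapping("category", legacy=True)
new_style = eager_ordinals_mapping("category", legacy=False)
```

Either way the trade-off is the one described above: the ordinals are rebuilt as part of refresh, so the first search after a change no longer pays for it.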

This seems to solve my problems.

global ordinals docs

