Aggregations in 2.1.0 much slower than 1.6.0

symos · January 7, 2016, 3:07pm

No, that's not it, I'm afraid. We run the same query over and over and yes, results come much faster after the first run, but still much slower than in 1.7.4. When I say it takes X seconds on 2.1.1 and Y seconds on 1.7.4, I always mean after we have run it several times.

An interesting difference I am noticing is that when running the query on 2.1.1, it creates 1GB worth of fielddata and 0MB worth of filter cache. When I run the same query on 1.7.4 it creates 1.7GB worth of fielddata and 450MB worth of filter cache. So maybe something has changed in the code since 1.7?
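For anyone wanting to compare the same numbers across versions, here is a minimal sketch that sums cache memory from the node stats API. It assumes a node reachable at `http://localhost:9200`, and it assumes the stats sections are named `fielddata`, `filter_cache` (1.x) and `query_cache` (what I believe the filter cache is reported as in 2.x). The helper names are made up for illustration.

```python
import json
import urllib.request


def fetch_node_stats(base_url="http://localhost:9200"):
    """Return the parsed JSON of GET /_nodes/stats/indices.

    base_url is an assumption; point it at one of your own nodes.
    """
    with urllib.request.urlopen(base_url + "/_nodes/stats/indices") as resp:
        return json.loads(resp.read().decode("utf-8"))


def cache_usage_mb(node_stats, sections=("fielddata", "filter_cache", "query_cache")):
    """Sum memory_size_in_bytes per cache section across all nodes, in MB.

    Sections missing from a node's stats are skipped, so the same call
    works whether the cluster reports filter_cache or query_cache.
    """
    totals = {}
    for node in node_stats.get("nodes", {}).values():
        indices = node.get("indices", {})
        for name in sections:
            if name in indices:
                size = indices[name].get("memory_size_in_bytes", 0)
                totals[name] = totals.get(name, 0) + size
    return {name: size / (1024.0 * 1024.0) for name, size in totals.items()}
```

Calling `cache_usage_mb(fetch_node_stats())` right after a warm run on each cluster should give figures comparable to the ones quoted above.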

jpountz · January 7, 2016, 4:01pm

Could you share your request and try to capture hot threads after fielddata has been loaded already to see where CPU goes in that case?

Luke_Nezda · January 7, 2016, 7:58pm

Unfortunately I've since torn down that cluster, but we tested by running our most common agg-heavy query hundreds of times against each configuration and came to the same conclusions as @symos.

symos · January 8, 2016, 9:52am

OK, how about this one:
https://dl.dropboxusercontent.com/u/23087609/hot_threads.zip

I've run the query and taken 10 "snapshots" of the hot threads, one every 1-2 seconds (the query takes around 17 seconds to finish). So this will give you a better idea of where the CPU goes.
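A rough sketch of how that kind of sampling could be scripted, in case it is useful to others. It polls `GET /_nodes/hot_threads` a fixed number of times with a pause in between. The base URL, the function names and the default interval are assumptions for illustration, not what was actually used to produce the zip above.

```python
import time
import urllib.request


def fetch_hot_threads(base_url="http://localhost:9200"):
    """Return the plain-text output of GET /_nodes/hot_threads.

    base_url is an assumption; point it at one of your own nodes.
    """
    with urllib.request.urlopen(base_url + "/_nodes/hot_threads") as resp:
        return resp.read().decode("utf-8")


def capture_hot_threads(fetch=fetch_hot_threads, count=10, interval=1.5,
                        sleep=time.sleep):
    """Call fetch() `count` times, pausing `interval` seconds between calls.

    fetch and sleep are injectable so the loop can be exercised offline.
    """
    snapshots = []
    for i in range(count):
        snapshots.append(fetch())
        if i < count - 1:  # no pause needed after the last sample
            sleep(interval)
    return snapshots
```

Start the slow query in another shell, then call `capture_hot_threads()` and write each returned string to its own file so the samples can be compared side by side.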

Bear in mind the same query on version 1.x takes around 3.5 seconds.

I can also send you the request privately if you need it.

jpountz · January 14, 2016, 4:59pm

That would help, thanks. Can you send it to adrien (at) elastic.co?

jpountz · January 14, 2016, 5:12pm

Are you overriding the index.store.type setting by any chance? I'm surprised that it seems to use niofs while I would expect default_fs. I don't expect it to be the root cause of the problem, but it might contribute.
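One way to check this, sketched under some assumptions: fetch `GET /<index>/_settings` and look for an explicit `index.store.type`. The index name `logs` and the helper names are hypothetical. As far as I know, this response only lists settings set on the index itself, so a value configured in `elasticsearch.yml` would not show up here and has to be checked in the node config separately.

```python
import json
import urllib.request


def fetch_index_settings(index_name, base_url="http://localhost:9200"):
    """Return the parsed JSON of GET /<index_name>/_settings.

    base_url is an assumption; point it at one of your own nodes.
    """
    url = "%s/%s/_settings" % (base_url, index_name)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def store_type(settings_response, index_name):
    """Return the explicit index.store.type for an index, or None if unset.

    Handles both the nested form ({"index": {"store": {"type": ...}}})
    and the flat form returned with ?flat_settings=true.
    """
    settings = settings_response.get(index_name, {}).get("settings", {})
    flat = settings.get("index.store.type")
    if flat is not None:
        return flat
    return settings.get("index", {}).get("store", {}).get("type")
```

For example, `store_type(fetch_index_settings("logs"), "logs")` returning `None` would mean no index-level override is present for that index.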

jpountz · January 14, 2016, 6:47pm

I may have found the reason: https://github.com/elastic/elasticsearch/pull/15998

Ivan · January 14, 2016, 9:35pm

Good catch.

symos · January 18, 2016, 10:15pm

No, we're not overriding index.store.type, we just checked.

I also sent you the request via email. It might help confirm if the bug you mention is indeed what is causing the problem.

jpountz · January 19, 2016, 10:14am

Hmm, I could not find any email regarding this. Can you check that the email actually got sent?

jpountz · January 19, 2016, 10:15am

Oh never mind, I just found it.

jpountz · January 19, 2016, 10:46am

Looking at the request and the hot threads, https://github.com/elastic/elasticsearch/pull/15998 (which I already pasted above) should help resolve most of the slowdown. This will be available in 2.2, which should be released in the coming weeks. If you still have performance issues after upgrading to 2.2, I would be curious to get new hot threads to see what the new bottleneck is.

If you are willing to check as soon as possible, you could take a snapshot of your data and restore it in a nightly build of elasticsearch to compare performance. https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/distribution/tar/elasticsearch/2.3.0-SNAPSHOT/elasticsearch-2.3.0-20160119.093021-82.tar.gz
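The snapshot-and-restore route could look roughly like the request sequence below. This is only a sketch: the repository name, snapshot name and location are hypothetical, it assumes a shared filesystem (`fs`) repository that both clusters can reach, and I believe the location also has to be listed under `path.repo` in `elasticsearch.yml` on each node. The function only builds the requests; it does not send them.

```python
import json


def snapshot_restore_plan(repo, snapshot, location):
    """Build the REST calls for copying data to a test cluster via a snapshot.

    Returns a list of (cluster, method, path, body) tuples, where cluster
    is "source" (the existing cluster) or "test" (the nightly build).
    """
    repo_body = json.dumps(
        {"type": "fs", "settings": {"location": location}}, sort_keys=True
    )
    return [
        # Register the repository, then snapshot all indices and wait for it.
        ("source", "PUT", "/_snapshot/%s" % repo, repo_body),
        ("source", "PUT",
         "/_snapshot/%s/%s?wait_for_completion=true" % (repo, snapshot), ""),
        # Register the same repository on the test cluster and restore from it.
        ("test", "PUT", "/_snapshot/%s" % repo, repo_body),
        ("test", "POST", "/_snapshot/%s/%s/_restore" % (repo, snapshot), ""),
    ]
```

Each tuple can be sent with curl or any HTTP client against the matching cluster; after the restore finishes, rerun the same aggregation on both clusters and compare timings.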

symos · January 19, 2016, 11:06am

That's good to hear, let's hope that this will indeed solve the issue!

As for testing, unfortunately we can't do it right now, since we already reverted to 1.7.4 for our live setup and we'll leave it there for now as we have to deal with other parts of the migration. Our new staging server is not even live yet, so it will be a while before our new setup is fully functional and we're able to test.

So right now it looks like we will wait for 2.2 to be released and upgrade our staging server first to test. I will report back if the issue persists.

Thanks very much for your help and I'm glad we helped identify a problem!

jpountz · January 19, 2016, 2:47pm

Please do! Thanks for helping track down this problem.
