Elasticsearch 5 vs 8 performance and index size

Hello. I've updated from Elasticsearch 5.6 to 8.1, and load tests show a 20-40% decrease in performance. I was expecting 8.1 to be faster than 5.6, so I'm assuming we're doing something wrong. Another odd thing is that the index size, with the exact same number of documents, is 3-4x smaller. Do you have any suggestions on performance? Is the index size reduction normal?

elasticsearch 5.6 is EOL and no longer supported. Please upgrade ASAP.

(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns :elasticheart: )

What is the size, structure and complexity of your documents? What do the mappings look like for the different versions? What type of queries are you running? What is the size of the index? What is the shard size?

Speaking on behalf of Idorasi_Paul.
Our index has about 35,000 JSON documents, mostly with non-nested fields.
Both clusters (ES5 and ES8) have the same configuration: 3 primary shards and 33 replica shards.
The size of the index is about 37 GB in the ES5 cluster and about 12 GB in the ES8 cluster.
The shard size is about 1 GB per shard in ES5 and about 370-380 MB in the ES8 cluster.
We can't get our heads around how the size can be so different for the same data.
The mappings are exactly the same for both indices.
We are using templates to query (both the mappings and the query templates are a little too complex to share here), but they are exactly the same for both clusters.
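As a rough sanity check on the figures quoted above, here is a minimal Python sketch of the arithmetic. It assumes that "3 primary shards and 33 replica shards" means 36 shard copies in total, and it uses the approximate totals of 37 GB and 12 GB; both are readings of the post, not measured values.

```python
# Rough arithmetic on the sizes quoted in the post.
# Assumption: 3 primaries + 33 replica copies = 36 shard copies in total.
PRIMARIES = 3
REPLICA_COPIES = 33
TOTAL_COPIES = PRIMARIES + REPLICA_COPIES


def per_copy_gb(total_index_gb: float, copies: int = TOTAL_COPIES) -> float:
    """Average size of one shard copy, in GB."""
    return total_index_gb / copies


es5_total_gb = 37.0  # approximate total index size reported for ES5
es8_total_gb = 12.0  # approximate total index size reported for ES8

es5_per_copy = per_copy_gb(es5_total_gb)
es8_per_copy = per_copy_gb(es8_total_gb)
size_ratio = es5_total_gb / es8_total_gb

print(f"ES5: {es5_per_copy:.2f} GB per shard copy")
print(f"ES8: {es8_per_copy:.2f} GB per shard copy")
print(f"ES5 index is about {size_ratio:.1f}x the size of the ES8 index")
```

The per-copy averages land close to the reported shard sizes, and the ratio sits at the low end of the "3-4x" range, so the quoted numbers are at least consistent with each other.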

You have jumped up 3 major versions, and a lot has changed in Elasticsearch during that time. It is therefore hard to say which changes could be affecting you without looking at the mappings and queries to determine which features and query types you are relying on and narrow the field of possibilities.

Although I have not used Elasticsearch 5 in many years, I do believe it was the last version that had the _all field enabled by default. This resulted in larger indices, as all data was indexed twice, but I would not expect the difference to be nearly as large as you describe, so it is unlikely to be the only factor behind the difference in size. If you are running query clauses without specifying fields, it could however affect query performance. This is what the _all field was used for, and since its removal all fields have to be queried individually, which can be slower if you have large documents with a lot of fields.
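To illustrate the point about query clauses without fields, here is a minimal Python sketch that builds two `query_string` request bodies: one that names no fields and one that lists them explicitly. The helper `query_string_body` and the field names `title` and `description` are hypothetical; the actual queries in this thread were not shared.

```python
import json


def query_string_body(text, fields=None):
    """Build a query_string search body; `fields` limits which fields are searched."""
    clause = {"query": text}
    if fields:
        clause["fields"] = list(fields)
    return {"query": {"query_string": clause}}


# No fields named: on 5.x with defaults this would typically hit _all,
# while on later versions the query has to be expanded over many fields.
implicit = query_string_body("red shoes")

# Fields named explicitly (hypothetical names): only these are searched.
explicit = query_string_body("red shoes", fields=["title", "description"])

print(json.dumps(implicit, indent=2))
print(json.dumps(explicit, indent=2))
```

If the templates contain clauses shaped like the first body, listing the fields that actually matter may be worth testing on the ES8 cluster.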

Are all queries giving the same response/results?

Are all queries affected the same or are there some patterns in which ones are slower than others?

You are jumping 3 major versions, and Elastic has made storage improvements in 6.x, 7.x and 8.x. It is very hard to track exactly what changed, but I see no issue with this reduction in storage usage.

The difference concerned us because it was so large, but if you say it's normal, we are good.

We can't really determine a pattern, and the results look a little different. The result issue is next in line. We have added some extra logic to the mustache queries because we found some issues with commas. Could extra if/else mustache clauses cause such a difference in performance?

Edit: Removed the extra if else mustache clauses, no change.
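Since the results differ and the templates were changed, one way to check whether both clusters end up running the same query is to render the template on each side and compare the output. Below is a minimal Python sketch of the comparison step only. It assumes the two rendered queries have already been fetched as parsed JSON; the sample bodies and the helper `diff_paths` are hypothetical.

```python
def diff_paths(a, b, path=""):
    """Return the paths at which two JSON-like structures differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs = []
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else key
            if key not in a or key not in b:
                diffs.append(child)  # key exists on one side only
            else:
                diffs.extend(diff_paths(a[key], b[key], child))
        return diffs
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return [path]  # different number of clauses
        diffs = []
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(diff_paths(x, y, f"{path}[{i}]"))
        return diffs
    return [] if a == b else [path]


# Hypothetical rendered queries from the two clusters.
es5_rendered = {"query": {"bool": {"must": [{"match": {"title": "shoes"}}]}}}
es8_rendered = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "shoes"}}],
            "filter": [{"term": {"in_stock": True}}],
        }
    }
}

differences = diff_paths(es5_rendered, es8_rendered)
print(differences)
```

An empty list would suggest the templates render identically, in which case the difference in results and timing comes from how each version executes the query rather than from the mustache logic.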

A few different things here:

  1. Regarding the size difference: as long as all your data/fields are being indexed as expected, I'd say this change is most likely a result of the major version jumps.
  2. We can't really help with performance without a few more pieces of information:
    1. Example mappings on the index
    2. Example queries being run that are showing issues
  3. Take a look at the Profile API. It should show what a query is spending its time on and therefore allow for better tuning of it.
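As a sketch of how Profile API output could be compared between the two clusters, the Python snippet below flattens the query tree of a profiled search response into rows of depth, query type and time. It assumes the response was obtained by adding `"profile": true` to the search body and that each query node carries `type`, `time_in_nanos` and optional `children`; the sample response is made up, and the exact fields on 5.6 may differ slightly.

```python
def flatten_profile(response):
    """Flatten profiled query trees into (depth, type, time_in_nanos) rows."""
    rows = []

    def walk(node, depth):
        # int() tolerates versions that return the timing as a string.
        rows.append((depth, node.get("type", "?"), int(node.get("time_in_nanos", 0))))
        for child in node.get("children", []):
            walk(child, depth + 1)

    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for node in search.get("query", []):
                walk(node, 0)
    return rows


# Made-up response fragment for illustration only.
sample = {
    "profile": {
        "shards": [
            {
                "id": "[node][index][0]",
                "searches": [
                    {
                        "query": [
                            {
                                "type": "BooleanQuery",
                                "time_in_nanos": 900000,
                                "children": [
                                    {"type": "TermQuery", "time_in_nanos": 600000},
                                    {"type": "TermQuery", "time_in_nanos": "200000"},
                                ],
                            }
                        ]
                    }
                ]
            }
        ]
    }
}

rows = flatten_profile(sample)
for depth, query_type, nanos in rows:
    print(f"{'  ' * depth}{query_type}: {nanos / 1e6:.2f} ms")
```

A parent's time includes its children, so the rows should be read as a tree rather than summed. Running the same query with profiling on both clusters and comparing the rows side by side may show which clause accounts for the slowdown.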

Side Note: While I know you just upgraded to 8.1, consider upgrading to a newer 8.x version; there have already been significant improvements in the 8.x series since 8.1. It appears that 8.6.x is a relatively stable release, as the .1 patch doesn't have much and the .2 patch (which appears to be coming soon) doesn't have much either.


I'm not sure how many details I'm allowed to share with you regarding queries and mappings. I'll play around with the Profile API for both ES5 and ES8 and come back with details. Thanks!
