Hi everyone,
We are migrating Elasticsearch from version 0.90 to version 7.10.1. I know this is a very large upgrade for a single component.
During this migration we noticed that V7 queries are slower than V0 ones.
The slowdown shows up on version 7 even for simple queries.
A simple query that performs better on V0:
- v0 query:
{ "query": { "bool": { "must": [ { "term": { "some_term_value": 1234 } } ] } }, "from": 0, "size": 10 }
- v7 query:
{ "query": { "bool": { "must": [ { "term": { "some_term_value": 1234 } } ] } }, "from": 0, "size": 10 }
- some_term_value is mapped as long.
Machine specs:
- Both: 64 GB machine RAM, 24 GB Elasticsearch JVM heap.
- Both: same CPU type and SSD disks.
- Both: standalone machines.
- Both: using the JDK installed on the machine.
  - V0: 1.8.0_232
  - V7: 11.0.5
JVM options:
## JVM configuration
-Djava.io.tmpdir=...
-Xms24g
-Xmx24g
-Djava.net.preferIPv4Stack=true
## G1GC Configuration
11-:-XX:+UseG1GC
11-:-XX:G1ReservePercent=25
11-:-XX:InitiatingHeapOccupancyPercent=30
# use old-style file permissions on JDK9
-Djdk.io.permissionsUseCanonicalPath=true
## GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
Elasticsearch specs:
- V7 uses G1GC for garbage collection.
- Before this we tried concurrent mark sweep (CMS), but G1GC was much better for performance and CPU usage.
- Request caching is disabled, since it is unnecessary given how current our data must be.
- Our index is updated regularly by bulk inserts/deletes at an interval of 500 ms. This setting is the same for V0 and V7.
- 900+ mapping properties.
In our environment, these queries took:
- V0: ~20 ms to ~30 ms
- V7: ~40 ms to ~50 ms
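For anyone who wants to reproduce the comparison, the measurement can be sketched roughly as below. This is only a sketch: the helper names (`build_term_query`, `summarize_took`) are hypothetical, and it assumes the `took` values (in milliseconds) have already been read from the search responses of each cluster. The optional `"profile": true` flag is a V7 search feature and does not exist on 0.90; it adds overhead, so profiled runs should not be mixed into the timing comparison.

```python
import statistics


def build_term_query(field, value, size=10, profile=False):
    """Build the simple term query from this post as a request body."""
    body = {
        "query": {"bool": {"must": [{"term": {field: value}}]}},
        "from": 0,
        "size": size,
    }
    if profile:
        # Asks Elasticsearch 7 for a per-shard timing breakdown.
        # Not supported on 0.90, and it slows the query down.
        body["profile"] = True
    return body


def summarize_took(took_ms):
    """Summarize `took` values (ms) collected from repeated runs."""
    return {
        "min": min(took_ms),
        "median": statistics.median(took_ms),
        "max": max(took_ms),
    }
```

Sending the same body from `build_term_query("some_term_value", 1234)` to both clusters several times and passing the collected `took` values to `summarize_took` gives a like-for-like range per version.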
A sample of our most expensive queries:
- v0 query:
{ "from": "0", "size": "15", "fields": ["some_field"], "query": { "filtered": { "query": { "match_all": {} }, "filter": [ { "terms": { "some_term_value": [ 1234 ] } }, { ... } ] } }, "filter": { ... }, "sort": [ ... ] }
- v7 query:
{ "from": "0", "size": "15", "_source": ["some_source"], "query": { "bool": { "must": [{ "match_all": {} }], "filter": [{ "terms": { "some_term_value": [ 1234 ] } }, { "bool": { ... } } ] } }, "aggs": { ... }, "post_filter": { "bool": { ... } }, "sort": [ ... ] }
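The structural differences between the two bodies above can be sketched as follows. This is only an illustration of the pattern, not our actual migration code: the function name `convert_filtered_body` is hypothetical, it assumes the requested fields are available through `_source`, and it leaves `sort` untouched and does not add `aggs` or the inner `bool` wrappers, whose contents are elided above.

```python
import copy


def convert_filtered_body(v0_body):
    """Rewrite a 0.90-style search body into the 7.x shape shown above."""
    body = copy.deepcopy(v0_body)  # leave the caller's dict untouched

    # The top-level "filter" of 0.90 is called "post_filter" in 7.x.
    if "filter" in body:
        body["post_filter"] = body.pop("filter")

    # "fields" is replaced by "_source" filtering (assumption: the values
    # are read from _source rather than from separately stored fields).
    if "fields" in body:
        body["_source"] = body.pop("fields")

    # The "filtered" query no longer exists in 7.x; its parts map onto
    # the "must" and "filter" clauses of a "bool" query.
    query = body.get("query", {})
    if "filtered" in query:
        filtered = query["filtered"]
        filters = filtered.get("filter", [])
        if isinstance(filters, dict):
            filters = [filters]
        body["query"] = {
            "bool": {
                "must": [filtered.get("query", {"match_all": {}})],
                "filter": filters,
            }
        }
    return body
```

Clauses under `filter` in a `bool` query do not contribute to scoring, so this keeps the non-scoring behaviour of the old `filtered` form.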
My main question: given the huge version gap between V0 and V7 of Elasticsearch, why is version 7 slower than version 0?
If this is expected, do you have any suggestions for improving our query performance?
Or are we doing something wrong in our setup?
Thanks for your advice!
Have a nice day.