We recently started migrating from Elasticsearch 2.4.1 to Elasticsearch 7.5.2 but have hit a major roadblock with query performance. A query_string query that responds in around 800 ms on ES2 now takes around 2600 ms on ES7. Sample query:
GET myindex/_search
{
  "query": {
    "query_string": {
      "query": "((word1 AND word2) OR word3 OR word4 OR word5 OR (word6 AND word1) OR (word6 AND word1 AND word6) OR (word1 AND word7) OR word8 OR word9 OR word10) AND \"word1 word6\"",
      "fields": [
        "BKT04",
        "BKT05",
        "BKT06",
        "BKT07"
      ]
    }
  }
}
There is scope for improvement in how the query is formed and we are working on it, but why does the same query perform so much better in ES2 than in ES7?
Can the change in the default scoring algorithm introduced in ES7 be an issue?
Has the Elasticsearch/Lucene implementation of the query_string query changed?
Has anyone faced this kind of performance issue?
Have you taken a look at the profile API to figure out where most of the time is spent?
Also the explain output might help to see if the same query gets created in both cases.
Also, are the two queries and the mapping exactly the same? Please take a look if there are differences that might explain such a behaviour as well.
Also, I suppose you have monitoring in place that lets you look at memory in use, garbage collections, queries per second, etc. Does anything show up in your monitoring?
Also, when testing, did you leave some time for warming up before taking those performance numbers? Caching logic has slightly changed (along with a lot of other things across 4 major versions...)
I narrowed it down using the profile API and concluded that the query_string part is what responds slower in ES 7.
Yes, they are the same.
I don't see any spikes in the production environment.
I even set up an isolated single-node cluster for both ES7 and ES2, with no traffic on them and caching disabled. The config shared in the OP is for these isolated clusters. I still see the drastic performance difference between the two versions.
I don't understand that part: did you find this by using profile: true in the search request? The query_string query is ultimately parsed into several different queries, such as term queries. I wanted to make sure that those final queries are the same in both versions.
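The comparison suggested above can be sketched roughly as follows. This is only an illustration in Python that builds a profiled search body and counts the Lucene query types found in two profile responses. The helper names and the two sample responses are hypothetical and hand-written (they are not real cluster output), and the sketch assumes that 7.x labels each profile node with "type" while 2.x uses "query_type".

```python
from collections import Counter


def profiled_search_body(query_text, fields):
    """Build a _search body with profiling enabled for a query_string query."""
    return {
        "profile": True,
        "query": {"query_string": {"query": query_text, "fields": fields}},
    }


def lucene_query_types(response):
    """Count the Lucene query types that appear in a profile response."""
    counts = Counter()

    def walk(node):
        # Assumption: 7.x nodes carry "type", 2.x nodes carry "query_type".
        counts[node.get("type") or node.get("query_type")] += 1
        for child in node.get("children", []):
            walk(child)

    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            for node in search["query"]:
                walk(node)
    return counts


def type_differences(old_response, new_response):
    """Return {query_type: (old_count, new_count)} for types whose counts differ."""
    old = lucene_query_types(old_response)
    new = lucene_query_types(new_response)
    return {t: (old[t], new[t]) for t in sorted(set(old) | set(new)) if old[t] != new[t]}


# Hand-written, heavily trimmed samples for illustration only (hypothetical).
es2_sample = {"profile": {"shards": [{"searches": [{"query": [
    {"query_type": "BooleanQuery", "children": [{"query_type": "TermQuery"}]}
]}]}]}}
es7_sample = {"profile": {"shards": [{"searches": [{"query": [
    {"type": "BooleanQuery", "children": [{"type": "PointRangeQuery"}]}
]}]}]}}

if __name__ == "__main__":
    print(type_differences(es2_sample, es7_sample))
```

Running the same query with profile: true on both clusters and feeding the two responses to type_differences should point at query types that exist on only one side, which is usually faster than reading the full profile output by eye.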
I tried with profile: true, and there is a difference in how the final query looks. One change we discovered is that in version 2.4 a query on an integer field is parsed to a term query, whereas in version 7.5 it is parsed to a range query.
To solve this we changed the data type of the field to keyword, since we won't be running range queries on it, and found that changing the data type from integer to keyword resulted in better performance.
But my question is why a query on an integer field is parsed as a range query even when no range is specified in the query. The query looks like field: 1234. As you can see, no range is specified. Shouldn't this be parsed as a term query?
Another question is how to figure out how many such query parsing changes have been introduced that could be affecting search performance negatively.
That indeed was a change (I do not remember exactly when, I would need to look it up) in how numeric fields were stored and queried, and it explains your improved search performance when using keywords instead.
Is the performance now on par or are there other differences?
We were finally able to achieve performance similar to version 2.4 by changing the data type of most of the integer fields to keyword.
Learning: Avoid the integer data type unless you need range queries on a field. Keeping the data type as keyword is much better for search performance.
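For anyone wanting to try the same change, here is a minimal sketch in Python of the request bodies involved. It assumes 7.x typeless mappings, and the index and field names (myindex_v2, customer_id) are hypothetical, since the thread does not say which fields were integers. The type of an existing field cannot be changed in place, so the sketch assumes a new index plus a reindex.

```python
import json


def keyword_mapping(field_names):
    """Body for creating a new 7.x index where the given fields are keyword."""
    return {
        "mappings": {
            "properties": {name: {"type": "keyword"} for name in field_names}
        }
    }


def reindex_body(source_index, dest_index):
    """Body for POST _reindex, copying documents into the new index."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def exact_match_query(field, value):
    """Term query for an exact match; keyword values are sent as strings."""
    return {"query": {"term": {field: str(value)}}}


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print("PUT myindex_v2")
    print(json.dumps(keyword_mapping(["customer_id"]), indent=2))
    print("POST _reindex")
    print(json.dumps(reindex_body("myindex", "myindex_v2"), indent=2))
    print("GET myindex_v2/_search")
    print(json.dumps(exact_match_query("customer_id", 1234), indent=2))
```

The trade-off is the one stated above: a keyword field gives up numeric range queries and numeric sort order, so this only fits fields that are really identifiers.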