Slower query_string query in Elasticsearch 7.5 compared to Elasticsearch 2.4

We recently started migrating from Elasticsearch 2.4.1 to Elasticsearch 7.5.2 but have hit a major roadblock with query performance. A query_string query that responds in around 800ms on ES2 takes around 2600ms on ES7. Sample query:

GET myindex/_search
{
  "query": {
    "query_string": {
      "query": "((word1 AND word2) OR word3 OR word4 OR word5 OR (word6 AND word1) OR (word6 AND word1 AND word6) OR (word1 AND word7) OR word8 OR word9 OR word10) AND \"word1 word6\"",
      "fields": [
        "BKT04",
        "BKT05",
        "BKT06",
        "BKT07"
      ]
    }
  }
}

There is scope for improving how the query is built, and we are working on it, but why does the same query perform so much better in ES2 than in ES7?
Could the change in the default scoring algorithm between these versions be an issue?
Has the Elasticsearch/Lucene implementation of the query_string query changed?
Has anyone else faced this kind of performance issue?

Cluster info:
Primary Shards : 2
Replica: 2
Nodes: 3
Size: 20,000,000 docs (60 GB, excluding replicas)
Heap: 24g

Also note that indexing is done only once, at off-peak hours, and the cluster mostly serves search traffic.

Anyone, any directions/suggestions?

Have you taken a look at the profile API to figure out where most of the time is spent?
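For reference, a minimal sketch of a profiled search, reusing the index and field names from the original post (the shortened query text is only illustrative). The profile section of the response breaks the time down per Lucene query, so running the same request on both clusters allows a side-by-side comparison.

```console
# Run the same request against the ES 2.4 and the ES 7.5 cluster
GET myindex/_search
{
  "profile": true,
  "query": {
    "query_string": {
      "query": "(word1 AND word2) AND \"word1 word6\"",
      "fields": ["BKT04", "BKT05", "BKT06", "BKT07"]
    }
  }
}
```

Note that the field names inside the profile output differ somewhat between the two versions, so the comparison is easiest on the query types and their timings.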

Also the explain output might help to see if the same query gets created in both cases.

Also, are the two queries and the mapping exactly the same? Please take a look if there are differences that might explain such a behaviour as well.

Also, I suppose you have monitoring in place that lets you look at memory in use, garbage collections, queries per second, etc. Does anything stand out in your monitoring?

Also, when testing, did you leave some time for warming up before taking those performance numbers? Caching logic has changed slightly (along with a lot of other things across 4 major versions...)

Hope that helps as a start.

Thanks for replying.

I have zeroed in using the profile API and came to the conclusion that the query string part is what responds slower in ES 7.

Yes, they are the same.

I don't see any spikes in the production environment.

I even set up an isolated single-node cluster for each of ES7 and ES2, with no traffic on them and the cache disabled. The config shared in the OP is for the isolated clusters. I still see the drastic performance difference between the two versions.
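A sketch of how a cold measurement could be taken on such isolated nodes, assuming the goal is to rule out caching differences: clear the index caches, then send the search with the shard request cache turned off for that request. The `took` value in each response gives the server-side time to compare.

```console
# Clear the caches held for this index
POST myindex/_cache/clear

# Bypass the shard request cache for this one search
GET myindex/_search?request_cache=false
{
  "query": {
    "query_string": {
      "query": "(word1 AND word2) AND \"word1 word6\"",
      "fields": ["BKT04", "BKT05", "BKT06", "BKT07"]
    }
  }
}
```

Neither call touches the operating system's page cache, so the first run after a node restart can still be slower than later runs.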

I don't quite follow that part: did you reach that conclusion by using profile: true in the search request? The query string query is ultimately parsed into several different queries, such as term queries. I wanted to make sure that those final queries are the same in both versions.
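One way to see the final Lucene query without running a full search is the validate API with explain enabled. A minimal sketch, using the index and fields from the original post with a shortened, illustrative query string:

```console
GET myindex/_validate/query?explain=true
{
  "query": {
    "query_string": {
      "query": "(word1 AND word2) AND \"word1 word6\"",
      "fields": ["BKT04", "BKT05", "BKT06", "BKT07"]
    }
  }
}
```

The `explanation` string in each entry of the `explanations` array can then be compared between the 2.4 and 7.5 clusters.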

I tried with profile: true, and there is a difference in what the final query looks like. One change we discovered is that in version 2.4 a query on an integer field is parsed to a term query, whereas in version 7.5 it is parsed to a range query.

To solve this, we tried changing the data type of the field to keyword, since we won't be running range queries on it, and found that changing the data type from integer to keyword resulted in better performance.
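Since the type of an existing field cannot be changed in place, a change like this generally means creating a new index and reindexing into it. A minimal sketch for 7.5, where the index name `myindex_v2` and the field name `category_id` are hypothetical stand-ins (the thread does not name the integer fields):

```console
# New index with the former integer field mapped as keyword
PUT myindex_v2
{
  "mappings": {
    "properties": {
      "category_id": { "type": "keyword" }
    }
  }
}

# Copy the documents from the old index into the new one
POST _reindex
{
  "source": { "index": "myindex" },
  "dest": { "index": "myindex_v2" }
}
```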

But my question is why a query on an integer field is parsed as a range query even when no range is explicitly specified in the query. The query is like field: 1234. As you can see, no range is specified. Shouldn't this be parsed as a term query?
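For context: since Elasticsearch 5.0, numeric fields such as integer are indexed as points in a BKD tree rather than as terms in the inverted index, and an exact-value lookup on such a field is executed as a point range query whose lower and upper bounds are the same value. That is why a range shows up even though none was written. A small sketch to observe this on 7.5, with `category_id` as a hypothetical integer field:

```console
GET myindex/_validate/query?explain=true
{
  "query": {
    "query_string": {
      "query": "category_id:1234"
    }
  }
}
```

With the field mapped as integer, the explanation should show a range with identical bounds; with the field mapped as keyword, it should show a plain term.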

Another question is how to figure out how many such query parsing changes have been introduced that might be affecting search performance negatively.
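For the numeric-field change specifically, one starting point on 7.5 is to list which fields are mapped with a numeric type, since those are the ones whose exact-value queries are affected. A minimal sketch using the field capabilities API on the index from the original post:

```console
# Lists every field in the index together with its mapped type(s)
GET myindex/_field_caps?fields=*
```

This only covers the mapping side. Other parsing differences would still need to be found by comparing the profile or validate output of representative queries on both versions.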

That was indeed a change (I do not remember exactly when, I would need to look it up) in how numeric fields were stored and queried, and it explains your improved search performance when using keywords instead.

Is the performance now on par or are there other differences?

There is still a difference, although we seem to be inching towards the performance we had with version 2.4.

We were finally able to achieve performance similar to version 2.4 by changing the data type of most of the integer fields to keyword.
Learning: avoid the integer data type unless you need range queries on a field. Mapping such fields as keyword is much better for search performance.
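If a field needs both fast exact-match lookups and occasional range queries, one option is a multi-field that indexes the value both ways. A hedged sketch for 7.5; the index name `myindex_v2`, the field `category_id`, and the sub-field name `as_int` are all hypothetical:

```console
PUT myindex_v2
{
  "mappings": {
    "properties": {
      "category_id": {
        "type": "keyword",
        "fields": {
          "as_int": { "type": "integer" }
        }
      }
    }
  }
}
```

Exact-value queries would then target `category_id` and range queries `category_id.as_int`, at the cost of some extra index size.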