Hi,
While investigating ways to improve our ES queries, we noticed some odd query runtime behaviour and would like to ask for some insight into it.
Given the following mapping:
{
"settings": { ... },
"mappings": {
"properties": {
"ptIds": {
"type": "keyword"
}
}
}
}
We indexed 1 000 000 documents like:
{
"ptIds": [1049, 1325, 812, ... ]
}
The array contains only numbers.
The range of the numbers is between 1 and 2000.
The size of the array is between 1 and 150.
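A minimal Python sketch of how documents with this shape could be generated. The function name `make_doc`, the fixed seed, and the choice of unique IDs within one document are assumptions for illustration, not the original indexing script.

```python
import json
import random

def make_doc(rng, max_id=2000, max_len=150):
    """Build one document with 1..max_len IDs drawn from 1..max_id."""
    size = rng.randint(1, max_len)
    # Assumption: IDs are unique within a document; the description does not say.
    return {"ptIds": rng.sample(range(1, max_id + 1), size)}

rng = random.Random(42)  # fixed seed so the data set is reproducible
docs = [make_doc(rng) for _ in range(1000)]  # scale up to 1 000 000 for the full test

# Show the start of one generated document.
print(json.dumps(docs[0])[:80])
```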
We then queried the index with a terms query like:
{
"size": 60,
"query": {
"terms": {
"ptIds": [ ... ]
}
}
}
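To reproduce the test, the request bodies for an increasing number of IDs could be built roughly like this. This is only a sketch: the helper name `terms_query`, the random choice of IDs, and the upper bound of 70 IDs are assumptions, not the exact queries we ran.

```python
import json
import random

def terms_query(ids, size=60):
    """Return a search body with a terms query on ptIds for the given IDs."""
    return {"size": size, "query": {"terms": {"ptIds": list(ids)}}}

rng = random.Random(0)  # fixed seed for repeatable query sets
# One request body per ID count, from 1 up to 70 IDs (assumed upper bound).
bodies = {n: terms_query(rng.sample(range(1, 2001), n)) for n in range(1, 71)}

# Bodies around the point where the runtime changes: 16 vs. 17 IDs.
print(json.dumps(bodies[16]))
print(json.dumps(bodies[17]))
```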
The following image shows how the query runtime behaves as the number of IDs in the query increases. Everything is fine up to 16 IDs, but as soon as we use the terms query with 17 IDs, the runtime jumps suddenly.
The image was created with Elasticsearch 7.7.1, on our local machine with docker.
But our production cluster (v7.7.1, 26 Nodes, Index: shard 6 * 3, docs 921 590 140, size 201.74GB) shows the same characteristic.
Elasticsearch 7.15.1 shows the same characteristic and runtimes.
Elasticsearch 8.0.0-alpha2 shows the same characteristic, but with worse runtimes: a terms query with 70 IDs takes about 400 ms.
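Runtimes like these could be measured with a small harness along the following lines. This is a sketch under assumptions: `search` is a hypothetical callable standing in for the real client call, `fake_search` is a placeholder that does not contact a cluster, and the repeat count of 20 is arbitrary. The `took` field of the search response could be used instead of wall-clock time to get the server-side duration.

```python
import statistics
import time

def median_runtime_ms(search, body, repeats=20):
    """Call search(body) `repeats` times; return the median wall-clock time in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        search(body)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)

def fake_search(body):
    # Placeholder only: replace with a real search call against the index.
    return {"hits": {"hits": []}}

# Example body with 17 IDs, the count at which we see the jump.
body = {"size": 60, "query": {"terms": {"ptIds": list(range(1, 18))}}}
elapsed = median_runtime_ms(fake_search, body)
print(f"{len(body['query']['terms']['ptIds'])} IDs: {elapsed:.3f} ms")
```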
Can someone please shed some light on this behaviour?
How can we improve this query?