Came across some odd behavior: we have a query that runs in the tens of milliseconds until we go over 16 values in our terms filter. With 17 or more values included, performance degrades by a factor of 15-20.
Here is what the query looks like with values mocked for security:
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "problematicField.raw": [
              "value-1",
              "value-2",
              "value-3",
              "value-4",
              "value-5",
              "value-6",
              "value-7",
              "value-8",
              "value-9",
              "value-10",
              "value-11",
              "value-12",
              "value-13",
              "value-14",
              "value-15",
              "value-16",
              "value-17"
            ]
          }
        }
      ]
    }
  }
}
The underlying field's mapping looks like this (there are other fields, omitted for simplicity):
{
  "properties" : {
    "problematicField" : {
      "type" : "text",
      "fields" : {
        "raw" : {
          "type" : "keyword"
        }
      },
      "analyzer" : "lowercase"
    }
  }
}
Here is what I found so far:
- the specific values don't seem to matter; I can rearrange the first 16 and swap out the 17th, but as long as there are 16 or fewer values the query takes tens of milliseconds
- as soon as 17 or more values are in the terms filter, the latency averages around 1500 ms (measured roughly as in the sketch after this list)
- 'problematicField' is a low-cardinality field with values typically 3-15 characters long
- we are querying 7 indices at a time, all with the same schema/mapping; there are two particular indices I can remove to get it back down to tens of milliseconds
- these indices are the two largest of the 7, with pri.store.size in the 2-4 TB range
- these indices perform very well when the filter has 16 or fewer values
- these indices perform very well for other queries, including much more complex queries
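For reference, here is roughly how I am measuring this (a minimal sketch, not our actual client code; the host, credentials, and index names are placeholders, and the field values are mocked as above):

# Timing sketch: compares a 16-value terms filter against a 17-value one.
# ES_URL, AUTH, and the index names are placeholders, values are mocked.
import time
import requests

ES_URL = "https://localhost:9200"
AUTH = ("elastic", "changeme")
INDICES = ",".join(f"index-{i}" for i in range(1, 8))  # the 7 indices, names mocked

def time_terms_filter(num_values):
    body = {
        "query": {
            "bool": {
                "filter": [
                    {
                        "terms": {
                            "problematicField.raw": [
                                f"value-{i}" for i in range(1, num_values + 1)
                            ]
                        }
                    }
                ]
            }
        }
    }
    start = time.perf_counter()
    resp = requests.post(f"{ES_URL}/{INDICES}/_search", json=body, auth=AUTH)
    wall_ms = (time.perf_counter() - start) * 1000
    took_ms = resp.json().get("took")  # server-reported time in ms
    print(f"{num_values} values: took={took_ms} ms, wall={wall_ms:.0f} ms")

for n in (16, 17):
    time_terms_filter(n)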
- we are on Elasticsearch 8.6.2
Is there some data structure or algorithmic threshold we cross once we go over 16 values, particularly on large indices? The number 16 being as significant as it is in computing makes me wonder whether this is really a coincidence.
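In case it points to something, here is a sketch of how I could inspect what the filter turns into at the Lucene level using the profile API (same placeholder connection details and mocked values as in the timing sketch above):

# Profile sketch: runs the 17-value query with "profile": true and prints the
# Lucene query type/description reported for each shard. Placeholders as above.
import requests

ES_URL = "https://localhost:9200"
AUTH = ("elastic", "changeme")
INDICES = ",".join(f"index-{i}" for i in range(1, 8))  # names mocked

body = {
    "profile": True,
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {
                    "terms": {
                        "problematicField.raw": [f"value-{i}" for i in range(1, 18)]
                    }
                }
            ]
        }
    },
}

resp = requests.post(f"{ES_URL}/{INDICES}/_search", json=body, auth=AUTH).json()
for shard in resp["profile"]["shards"]:
    for search in shard["searches"]:
        for q in search["query"]:
            print(shard["id"], q["type"], q["description"][:80])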
Has anyone seen anything like this?
Appreciate any help or insight you may have!