I have an Elasticsearch index with about 93 million documents (68.8GB) across 3 shards. During load testing (300 concurrent requests) with one particular query, CPU usage shoots up to 100%. The query is fairly simple:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "release_title": "Spider - Singles Collection 1976-86" } },
        { "match": { "track_title": "TALKIN BOUT ROCK N ROLL (New Version)" } }
      ]
    }
  }
}
However, if I make the same type of query with different strings, the CPU usage is normal.
Here is the link to the output of Hot Threads (I'm a beginner in ES, not sure how to interpret this): https://pastebin.com/XTs6vvUF
The system has 14 cores and 40GB of RAM. I've set the JVM heap size to 16GB. Could someone help me figure out what's going wrong?
Is this query hitting a lot of documents? Is it different from the other queries you are using? How long does it take to execute? You can also use the Profile API to see where the time is spent.
Note that by default a match query uses OR to combine terms, so you may simply be hitting a lot of documents that need to be scored.
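To illustrate: profiling can be enabled by adding "profile": true to the search request body (the index name below is a placeholder; the fields are taken from your query):

```json
GET /your_index/_search
{
  "profile": true,
  "query": {
    "bool": {
      "must": [
        { "match": { "release_title": "Spider - Singles Collection 1976-86" } },
        { "match": { "track_title": "TALKIN BOUT ROCK N ROLL (New Version)" } }
      ]
    }
  }
}
```

The response then includes a per-shard breakdown of how much time each Lucene query component spent on scoring, building, and advancing, which should show whether term matching or scoring dominates.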
Yes, it's hitting a lot of documents (about 1.9 million).
The structure of the query is identical to my other queries; only the query string values change. But I've noticed that the good queries (the ones that don't cause the CPU spike) hit far fewer documents (<0.1 million).
On its own, the query takes as little as 40ms to execute, but under load (300 concurrent requests with the same query) it averages 2s and goes as high as 10s.
Ah, I didn't realize that was the default behavior of match. I've modified the query with AND and fuzziness (which is the intended behavior) and will keep you updated with the new results. Thank you!
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "release_title": {
              "query": "Spider - Singles Collection 1976-86",
              "operator": "and",
              "fuzziness": "AUTO"
            }
          }
        },
        {
          "match": {
            "track_title": {
              "query": "TALKIN BOUT ROCK N ROLL (New Version)",
              "operator": "and",
              "fuzziness": "AUTO"
            }
          }
        }
      ]
    }
  }
}
This is giving me much better results (an average of ~900ms), but I'd like to bring it down further, to ~300ms or less. If I remove the fuzziness parameter, the average drops to ~40ms. However, I do need fuzziness to account for misspellings and other slight variations in the strings. Is there anything else I can do to speed up the query?
fuzziness is only one solution to this problem, and it is a solution that happens at query time (hence the slowdown). You may want to take a look at the phonetic analysis plugin, which moves the work to index time instead.
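As a sketch of that approach (assuming the analysis-phonetic plugin is installed, on ES 7+; all the analyzer and field names here are illustrative, not required):

```json
PUT /your_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_phonetic_filter": {
          "type": "phonetic",
          "encoder": "double_metaphone",
          "replace": false
        }
      },
      "analyzer": {
        "my_phonetic_analyzer": {
          "tokenizer": "standard",
          "filter": [ "lowercase", "my_phonetic_filter" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "track_title": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "my_phonetic_analyzer"
          }
        }
      }
    }
  }
}
```

A plain (non-fuzzy) match against track_title.phonetic would then tolerate spelling variations that sound the same, because both the indexed terms and the query terms are reduced to phonetic codes at analysis time.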
Another idea might be to run a non-fuzzy query by default and only run the fuzzy one if the first returns no hits (although this might actually be slower in the zero-hits case, since two queries have to run).
Also, maybe you don't need fuzziness at all: if you're not interested in typos within single terms, it might be enough that only a certain percentage of those terms match. In that case, take a look at the minimum_should_match parameter of the match query.
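For example, a non-fuzzy match requiring only 75% of the terms to be present might look like this (the percentage is just an illustrative starting point to tune):

```json
{
  "query": {
    "match": {
      "track_title": {
        "query": "TALKIN BOUT ROCK N ROLL (New Version)",
        "minimum_should_match": "75%"
      }
    }
  }
}
```

This keeps term matching exact (and therefore cheap) while still tolerating a missing or extra word in the title.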
As you can see, this is a very broad question that usually revolves around the data and the concrete use case, so it's hard to give the one definitive answer, but hopefully this leaves you with a couple more options to explore.