I'm trying to understand how to scale an Elasticsearch cluster which is extremely slow right now!
Can anyone help?
It's approx. 124,000,000 documents totalling about 60GB.
I index the data once per month. In terms of usage, there are about 10-15 requests per second using a boolean query.
I've done my mappings, only index what needs searching and not use keywords for full-text only fields etc...
Hardware-wise, it's running across two nodes - each 8GB ram & 2 vCPUs. It's on an Elastic Cloud i/o cluster.
So here's the thing.. queries run much quicker on 1 shard (1 pri and 1 rep) than it does on say 3 or 5 shards, even though that 1 shard is 60GB. Can anyone tell me why?
Also I've noticed that CPU usage is through the roof but memory usage is relatively low..
Is the as a result of boolean queries?
Also with the Profile API | Elasticsearch Guide [7.13] | Elastic you could see where you are spending your time (has a UI in Kibana's Dev Tools). Is it fast enough with a single shard or do we need to dig deeper?
Blockquote Searches run on a single thread per shard
Most searches hit multiple shards. Each shard runs the search on a single CPU thread. While a shard can run multiple concurrent searches, searches across a large number of shards can deplete a node’s search thread pool. This can result in low throughput and slow search speeds.
I * think * what's happening is the searches are piling up before search requests complete. Without load, on Elastic Cloud i/o machines (2vCPUs) a multisearch of 5-7 items was taking 2000-3000 milliseconds. On the compute optimized (4vCPUs) I can get that down to 200-300ms.
Does that sound right? Or wildly wrong?
(I have also done a few things to help with timings like index:false for non-essential items and stripping out ~30million records.)
I am going to load test again and see what that brings.
Not sure this is really "a large number of shards", but it might just be a better setup for your hardware, documents, queries, and shard size / count. Also you're saving the the coordination overhead (to combine the subresults of the individual shards).
And if you're CPU-bound, more CPU definitely sounds like a good approach. If you're not actively writing to the index a forcemerge might also be helpful. After that it might be possible to find a more performant way to write your query, but that will totally depend on your current queries.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.