We would like to improve the response time of a search query on our cluster. You may ask what kind of query, but I cannot share that information; basically, it does some full-text searches and then applies a function_score to the results.
Regardless of what our query might be, I was wondering how we can improve search response time solely by tweaking cluster settings.
- there are 14 data nodes on the cluster, each has 128 GB RAM and 64 CPUs.
- 2800 shards distributed amongst the data nodes.
- Elasticsearch heap size is 30 GB on each machine.
- writes and reads are not separated across nodes.
- the index we are experimenting on has 200 primary shards and 1 replica, so 400 shards in total.
- total size of the index is 12 TB.
- response time: 10 seconds
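For context, the query shape is roughly the following. This is only an illustrative sketch; the index name, field names, and scoring function are made up, since the real query cannot be shared:

```
GET my-index/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": { "body": "example search terms" }
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "popularity",
            "modifier": "log1p",
            "missing": 1
          }
        }
      ],
      "boost_mode": "multiply"
    }
  }
}
```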
So far we have tried rolling the index over into 300 GB indices with 12 primary shards and no replication.
response time: only about 10 percent faster on average.
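The rollover we tried was along these lines. The alias name and threshold below are illustrative (300 GB total primary size across 12 primary shards works out to roughly 25 GB per shard):

```
POST my-index-alias/_rollover
{
  "conditions": {
    "max_size": "300gb"
  }
}
```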
Next we will try grouping the data into indices by the title field (titles from A-F in one index, G-K in another, etc.). Will this improve the search speed?
What is the way to go after this to improve speed? It can be hardware improvements, index setting changes, or anything else; we are open to recommendations.
Without access to information about the version you are running, your queries, data, mappings, and load patterns, it is hard to provide anything but general recommendations.
It does seem like each node holds a reasonably large amount of data, in excess of multiple TB per node. Given the large number of shards, you will have a good level of concurrency when processing the query. As the data set, even just for the index in question, let alone the full data volume, exceeds what can be held in the page cache, it is not unlikely that you are limited by storage performance, so this is the first thing I would recommend checking. If you see high iowait and contention, I would recommend upgrading to fast local SSDs if you are not already using them. I would also recommend going through this guide.
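To check whether storage is the bottleneck, you can look at I/O stats both from Elasticsearch and from the OS. A rough sketch (exact fields and output vary by version and platform):

```
# per-node filesystem and I/O stats reported by Elasticsearch
GET _nodes/stats/fs

# search thread pool activity, queues and rejections
GET _cat/thread_pool/search?v&h=node_name,active,queue,rejected
```

On the data nodes themselves, something like `iostat -x 2` will show iowait and per-device utilization while the query is running.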
I'm curious about the shard size of the index you are querying (the one you are trying to improve reads on). Can you share that info?
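If it helps, per-shard store sizes can be listed with the _cat API; the index pattern below is just a placeholder:

```
GET _cat/shards/my-index*?v&h=index,shard,prirep,store&s=store:desc
```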
We are also constantly trying to tweak performance in our setup. We have 25+ data nodes. Adding more data nodes seems to be the only valid solution in our setup so far; I have not been able to find a way to improve reads without increasing cluster capacity.
Our shards are around 30 GB each when it is one whole index; after we rolled it over into many indices, they varied between 30 and 50 GB.