I have a 30-node Elasticsearch cluster (3 master and 27 data nodes) holding 400 GB of primary data with 2 replicas. The index has 27 shards, each holding 15 GB of primary data. Right now we are getting 900 searches per second, and we expect that to grow to 2 million per hour. How do we scale our cluster to serve all our search requests with minimum latency? What is the best practice for scaling search?
One fairly straightforward way to add extra capacity for searches is to increase the number of replica shards.
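For reference, the replica count is a dynamic index setting, so it can be changed on a live index through the update index settings API (`PUT /<index>/_settings`). Here is a minimal sketch that only builds the request without sending it. The index name `my-index` and the address `http://localhost:9200` are assumptions for illustration and do not come from the original post.

```python
import json
import urllib.request


def build_replica_request(base_url, index, replicas):
    """Build (but do not send) a PUT <index>/_settings request."""
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical values: the URL and index name are assumptions.
req = build_replica_request("http://localhost:9200", "my-index", 3)
print(req.get_method(), req.full_url, req.data.decode())
# To actually apply the change: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect what would change before touching a production cluster.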
@whatgeorgemade Thank you for responding.
How many replicas do you suggest? Won't more replicas slow down indexing? Would adding more shards and nodes help?
I've just had another look at your numbers, and there may be a mistake, because you'd be getting fewer searches per hour than you do now if the numbers in your post are correct: 900 searches per second = 3.24 million per hour.
Assuming there's a typo in your numbers and your search traffic really will increase, the first thing to figure out is whether or not you actually need more capacity. This depends on how your cluster is behaving with the current search traffic and current hardware. Are any hardware resources (CPU, memory, disk, etc.) already stretched?
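The unit conversion is easy to double-check. A small sketch using only the two figures from the original post (900 searches per second now, 2 million per hour expected):

```python
SECONDS_PER_HOUR = 60 * 60


def per_second_to_per_hour(rate_per_second):
    """Convert a per-second rate to a per-hour rate."""
    return rate_per_second * SECONDS_PER_HOUR


def per_hour_to_per_second(rate_per_hour):
    """Convert a per-hour rate to a per-second rate."""
    return rate_per_hour / SECONDS_PER_HOUR


current_per_hour = per_second_to_per_hour(900)
target_per_second = per_hour_to_per_second(2_000_000)

print(current_per_hour)             # 3240000
print(round(target_per_second, 1))  # 555.6
```

So 2 million searches per hour is roughly 556 per second, which is below the current 900 per second.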
Indexing speed shouldn't change much if you add more replicas. This is because once the data is indexed on the primary shard, it's indexed on all replica shards in parallel.
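That said, each extra replica is another full copy of every shard that has to be written and stored, so it is worth knowing what a change would cost. A rough sketch of that arithmetic using the figures from the original post (27 shards, 2 replicas, 400 GB of primary data, 27 data nodes), assuming shard copies are spread evenly across the data nodes:

```python
def shard_footprint(primaries, replicas, primary_data_gb, data_nodes):
    """Estimate shard copies and stored data for a given replica count."""
    copies = primaries * (1 + replicas)
    return {
        "shard_copies": copies,
        # Assumes an even spread of copies over the data nodes.
        "copies_per_data_node": copies / data_nodes,
        "total_data_gb": primary_data_gb * (1 + replicas),
    }


current = shard_footprint(27, 2, 400, 27)
one_more = shard_footprint(27, 3, 400, 27)

print(current)   # 81 copies, 3.0 per data node, 1200 GB in total
print(one_more)  # 108 copies, 4.0 per data node, 1600 GB in total
```

Going from 2 to 3 replicas would therefore add one more shard copy and roughly 15 GB to every data node, under these assumptions.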
It's impossible for me to say how many replicas to add because it depends on too many variables.
For now, I'd suggest monitoring your cluster closely to see how it's managing with the current workload. Deal with any existing problems before planning for future growth.
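One concrete signal to watch is the search thread pool on each node, which the cat thread pool API exposes (for example `GET _cat/thread_pool/search?format=json&h=node_name,active,queue,rejected`). The sketch below only parses such a response. The sample payload, the node names, and the `max_queue` threshold of 100 are all made up for illustration; pick a threshold that suits your own cluster.

```python
import json

# Hypothetical sample of a _cat/thread_pool/search response in JSON format.
# Node names and numbers are invented; cat APIs return values as strings.
SAMPLE = """[
  {"node_name": "data-01", "active": "13", "queue": "0", "rejected": "0"},
  {"node_name": "data-02", "active": "13", "queue": "250", "rejected": "42"}
]"""


def nodes_under_pressure(rows, max_queue=100):
    """Return names of nodes with rejected searches or a long search queue."""
    flagged = []
    for row in rows:
        if int(row["rejected"]) > 0 or int(row["queue"]) > max_queue:
            flagged.append(row["node_name"])
    return flagged


print(nodes_under_pressure(json.loads(SAMPLE)))  # ['data-02']
```

A growing queue or a non-zero rejected count on the search pool suggests the node cannot keep up with the current search load.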
If you have a paid subscription with Elastic, it'd be worth getting in touch with your support rep, too.
Hope this helps.