The ultimate answer is the 42 of IT, aka "It depends". Let's break this down and see which factors come into play.
It's rather hard to give exact advice here, because the answer hinges on all of the following:
- Your indexing load - how many documents per second/minute are you indexing?
- Your total index & data size
- Your query load - how many queries per second are you executing?
- Your query complexity - are you doing a simple full text search, or are you running complex and deeply nested aggregations with hundreds of buckets per request?
- Your storage strategy - is all of your data in your hot tier, and does all of it need to be queried with the shortest response times?
- Your indexing complexity: Do you have a complex mapping that requires CPU intensive analysis of your strings?
- Your replication strategy: How many copies of your data do you really need within your running cluster?
Knowing none of these things, giving a number for scaling would be one of two things:
- A lie, resulting in an underperforming system
- Such a conservative number that your system would work regardless of the above factors, resulting in an underutilized system
Neither outcome is something that we, as a provider of Elasticsearch, or you, as the user, should accept.
Now a couple of strategies. First, get somewhat more familiar with sizing and do a sizing exercise with the data you already have:
- https://www.elastic.co/webinars/elasticsearch-sizing-and-capacity-planning
- https://www.elastic.co/webinars/elasticsearch-scaling-best-practices
- https://www.elastic.co/blog/sizing-hot-warm-architectures-for-logging-and-metrics-in-the-elasticsearch-service-on-elastic-cloud
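To make the idea of a sizing exercise a bit more concrete, here is a minimal back-of-the-envelope sketch in Python. Treat it as a rough starting point only: the function name, the headroom and safety margin percentages, the disk-to-RAM ratio and all input values are illustrative assumptions, not recommendations, and the result needs to be validated against your own data.

```python
import math


def estimate_data_nodes(raw_gb_per_day, retention_days, replicas,
                        ram_gb_per_node, disk_to_ram_ratio,
                        watermark_headroom=0.15, safety_margin=0.10):
    """Rough storage-driven estimate of the number of data nodes.

    All defaults are assumptions for illustration only.
    """
    # Every replica is a full extra copy of the primary data.
    total_data_gb = raw_gb_per_day * retention_days * (replicas + 1)
    # Leave free disk headroom plus a margin for estimation errors.
    total_storage_gb = total_data_gb * (1 + watermark_headroom + safety_margin)
    # Assumed amount of disk a node can serve per GB of RAM.
    storage_per_node_gb = ram_gb_per_node * disk_to_ram_ratio
    nodes = math.ceil(total_storage_gb / storage_per_node_gb)
    return total_data_gb, total_storage_gb, nodes


# Hypothetical workload: 10 GB/day, kept for 30 days, 1 replica,
# on nodes with 8 GB RAM and an assumed 1:30 RAM-to-disk ratio.
data_gb, storage_gb, nodes = estimate_data_nodes(
    raw_gb_per_day=10, retention_days=30, replicas=1,
    ram_gb_per_node=8, disk_to_ram_ratio=30,
)
print(f"data: {data_gb} GB, storage: {storage_gb:.0f} GB, data nodes: {nodes}")
```

Note that this only looks at storage. Query load, query complexity and indexing complexity from the list above are not covered, which is exactly why benchmarking with your own data is still needed.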
That first part is the theory; the second part is more about practice. You need to figure out the correct sizing with your own data. This is where Rally comes in: a macrobenchmarking framework for Elasticsearch that lets you test with your own data.
The third part is about monitoring. If you use Elastic Cloud, you can also configure monitoring for your cluster, so you can see how much memory is actually needed and how that grows over time, which tells you when you need to scale your cluster up or out.
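If you want to look at the raw numbers yourself, the node stats API (GET _nodes/stats/jvm) reports heap usage per node. The following is a minimal Python sketch that takes an already parsed response and lists nodes above a heap threshold. The function name, the 75% threshold and the sample data are assumptions for illustration; fetching the response from your cluster is left out on purpose.

```python
def nodes_over_heap_threshold(nodes_stats, threshold_percent=75):
    """Return {node name: heap used %} for nodes at or above the threshold.

    `nodes_stats` is the parsed JSON body of GET _nodes/stats/jvm.
    The default threshold is an arbitrary assumption, not a recommendation.
    """
    flagged = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        heap_pct = node["jvm"]["mem"]["heap_used_percent"]
        if heap_pct >= threshold_percent:
            # Fall back to the node id if the name is missing.
            flagged[node.get("name", node_id)] = heap_pct
    return flagged


# Made-up sample shaped like a node stats response, for demonstration only.
sample = {
    "nodes": {
        "abc": {"name": "instance-0", "jvm": {"mem": {"heap_used_percent": 82}}},
        "def": {"name": "instance-1", "jvm": {"mem": {"heap_used_percent": 41}}},
    }
}
print(nodes_over_heap_threshold(sample))
```

A single reading says little, since heap usage rises and falls with garbage collection. The trend over days or weeks is what indicates whether you need to scale.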