Can I predict how much memory I need?


I am sorry if my question is silly. I am actually a chemical engineer who wants to use Elasticsearch in my own mobile app.

I want to deploy using Elastic Cloud. There are storage and memory settings. I can predict how many GB of storage I need based on how many documents I have, but what about memory? How do I know whether, let's say, 1 GB of memory is enough for my project?

I assume this 1 GB of RAM relates to the number of search requests that can be handled. Am I right? If 10k users search at the same time, will that hit the 1 GB limit? Can I predict this before deploying?

Or let's say I already use 2 GB of memory, a lot of users are already using my app, and more and more keep joining. How do I know whether this memory is still enough? Is there any indicator?


That is not a silly question at all! Understanding one's sizing requirements is an ever-ongoing quest :slight_smile:

The ultimate answer is the 42 of IT, a.k.a. "it depends". Let's break this down and see which factors come into play.

It's rather hard to give exact advice here, as many factors matter:

  • Your indexing load: how many documents per second/minute are you indexing?
  • Your total index & data size
  • Your query load: how many queries per second are you executing?
  • Your query complexity: are you doing a simple full-text search, or are you running complex and deeply nested aggregations with hundreds of buckets per request?
  • Your storage strategy: is all of your data in your hot tier, needing to be queried with the shortest response times?
  • Your indexing complexity: do you have a complex mapping that requires CPU-intensive analysis of your strings?
  • Your replication strategy: how many copies of your data do you really need within your running cluster?

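Several of the factors above can be measured on a running cluster, since Elasticsearch exposes cumulative indexing and search counters via the node stats API. As a rough sketch (the payloads below are illustrative made-up excerpts, not a real `GET _nodes/stats/indices` response), you can sample those counters twice and turn the deltas into per-second rates:

```python
# Sketch: derive indexing and query rates from two samples of the cumulative
# counters Elasticsearch exposes under GET _nodes/stats/indices.
# The payloads below are illustrative, not from a real cluster.

sample_t0 = {
    "indices": {
        "indexing": {"index_total": 1_000_000},
        "search": {"query_total": 50_000},
    }
}
sample_t1 = {
    "indices": {
        "indexing": {"index_total": 1_006_000},
        "search": {"query_total": 50_900},
    }
}
interval_seconds = 60  # time between the two samples


def rate(before, after, path, seconds):
    """Per-second rate of a cumulative counter between two stats samples."""
    b, a = before, after
    for key in path:
        b, a = b[key], a[key]
    return (a - b) / seconds


indexing_rate = rate(sample_t0, sample_t1,
                     ["indices", "indexing", "index_total"], interval_seconds)
query_rate = rate(sample_t0, sample_t1,
                  ["indices", "search", "query_total"], interval_seconds)

print(f"indexing: {indexing_rate:.0f} docs/s, queries: {query_rate:.0f} qps")
# → indexing: 100 docs/s, queries: 15 qps
```

Knowing your real docs/s and qps numbers gives you something concrete to feed into a sizing exercise, instead of guessing.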
Knowing none of these things, any number I gave you for scaling would be one of two things:

  1. A lie, resulting in an underperforming system
  2. Such a generously over-provisioned number that, independent of the above factors, your system would work, resulting in an underutilized system

Neither of these is something that we as the provider of Elasticsearch, or you as the user, should accept.

Now, a couple of strategies. First, get somewhat more familiar with sizing and do a sizing exercise with the data you already have.

The first part is the theory; the second part is more about practice. You need to figure out the correct sizing with your own data. This is where Rally comes in: a macrobenchmarking framework for Elasticsearch that allows you to test with your own data.

The third part is about monitoring: if you use Elastic Cloud, you have the ability to also configure monitoring of your cluster, so you can see how much memory is actually needed and how usage increases over time, and thus know when you need to scale your cluster up or out.
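As one concrete indicator you can watch, Elasticsearch reports each node's JVM heap usage percentage via `GET _nodes/stats/jvm`. The sketch below checks it against a threshold; the payload is an illustrative made-up excerpt, and the 75% threshold is an assumption based on the common guidance that sustained heap usage above roughly that level signals memory pressure:

```python
# Sketch: flag nodes whose JVM heap usage looks too high, based on the shape
# of a GET _nodes/stats/jvm response. The payload below is illustrative.

HEAP_WARN_PERCENT = 75  # assumption: sustained heap above ~75% suggests pressure

nodes_stats = {
    "nodes": {
        "node-1": {"name": "instance-0000000000",
                   "jvm": {"mem": {"heap_used_percent": 62}}},
        "node-2": {"name": "instance-0000000001",
                   "jvm": {"mem": {"heap_used_percent": 81}}},
    }
}


def nodes_under_pressure(stats, threshold=HEAP_WARN_PERCENT):
    """Return the names of nodes whose heap usage exceeds the threshold."""
    return [node["name"]
            for node in stats["nodes"].values()
            if node["jvm"]["mem"]["heap_used_percent"] > threshold]


print(nodes_under_pressure(nodes_stats))
# → ['instance-0000000001']
```

A single spike is not proof of anything; it is the trend over time, as your user base grows, that tells you when to scale.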

Hope this helps as a start.

