Are there any limiting factors when picking the amount of memory to give to my Elasticsearch nodes? What I mean is: are there any metrics that impose a minimum "hard limit" on the memory of the machines in my cluster, below which I could EXPECT nodes to crash from running out of memory?
For example, if I have indices in the 90 GB range with 3 shards each, then my shards are about 30 GB each. Does this have any implications for machine size? Should I have at least 30 GB of memory per node, or does that not matter as much?
My goal is to allocate machines that are as small as possible, and I'm just trying to figure out what number NOT to go under to avoid predictable crashes.
This is a hard question to answer, because a lot of factors come into play here. First, you don't need to load all your data into memory (although having plenty of memory lets the file system cache do its work, which gives a great boost). The main question is what data structures will live in the heap of the Elasticsearch process. Do you run a lot of deep, nested aggregations that need memory with every request? Will you do a lot of highlighting, which might need CPU or disk space instead?
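If you want to see which of those data structures are actually occupying heap right now, the node stats API breaks it down per node. Here is a minimal sketch, assuming a local node at http://localhost:9200 with no authentication (adjust the URL and auth for your own cluster):

```python
import requests

ES_URL = "http://localhost:9200"  # assumption: local node, security disabled

# Node stats with the jvm and indices metrics include a per-node
# breakdown of heap usage and on-heap data structures.
stats = requests.get(f"{ES_URL}/_nodes/stats/jvm,indices").json()

for node in stats["nodes"].values():
    heap_used = node["jvm"]["mem"]["heap_used_in_bytes"]
    heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
    segments = node["indices"]["segments"]["memory_in_bytes"]
    fielddata = node["indices"]["fielddata"]["memory_size_in_bytes"]
    query_cache = node["indices"]["query_cache"]["memory_size_in_bytes"]
    print(f"{node['name']}: heap {heap_used >> 20}/{heap_max >> 20} MB, "
          f"segments {segments >> 20} MB, fielddata {fielddata >> 20} MB, "
          f"query cache {query_cache >> 20} MB")
```

If fielddata or segment memory is already a large slice of a small heap, that is a sign the node is undersized for the workload, independent of raw shard size on disk.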
Monitoring your existing cluster under its existing workload and checking things like garbage collection activity, query response times, and indexing throughput should give you an indication of whether a setup hits its SLA (the one you define) or not.
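As a rough sketch of that kind of monitoring, the same node stats API exposes cumulative GC and throughput counters you can sample periodically (again assuming an unauthenticated local node; sample twice and diff the counters to get rates over your own interval):

```python
import requests

ES_URL = "http://localhost:9200"  # assumption: local node, security disabled

stats = requests.get(f"{ES_URL}/_nodes/stats/jvm,indices").json()

for node in stats["nodes"].values():
    old_gc = node["jvm"]["gc"]["collectors"]["old"]      # old-gen collections
    heap_pct = node["jvm"]["mem"]["heap_used_percent"]
    indexing = node["indices"]["indexing"]
    search = node["indices"]["search"]
    print(f"{node['name']}:")
    print(f"  heap used: {heap_pct}%")
    print(f"  old-gen GC: {old_gc['collection_count']} collections, "
          f"{old_gc['collection_time_in_millis']} ms total")
    print(f"  indexing: {indexing['index_total']} ops, "
          f"{indexing['index_time_in_millis']} ms total")
    print(f"  search: {search['query_total']} queries, "
          f"{search['query_time_in_millis']} ms total")
```

Frequent or long old-generation collections under your real workload are usually the earliest warning that the heap is too small, well before nodes actually fall over.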