Is there a recommended storage size per heap for a node at all?
For example, if you have a 16GB heap, would 5TB of data on that node be too much? Or if a node has a 32GB heap, can it handle 10TB of data?
I couldn't find anything in the documentation, and searching Google has produced no results. Hopefully someone here can point me in the right direction.
There's no single recommendation, because it all depends.
What sort of data? What sort of queries? How many EPS/QPS? What version of ES? What version of the JVM? What do your mappings look like? How many shards? What sort of infrastructure?
Currently I have 7 data nodes, each with a ~32GB heap and 10TB of storage.
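For reference, this is roughly how I pull the per-node heap and disk numbers (a minimal Python sketch; the host URL is an assumption for my environment, and it uses the _cat APIs that ship with 1.x):

```python
import requests

# Assumption: the cluster is reachable at this address; adjust for your environment.
ES = "http://localhost:9200"

# Per-node heap usage (the _cat/nodes API; default columns include heap.percent).
print(requests.get(ES + "/_cat/nodes?v").text)

# Per-node shard count and disk used/available (the _cat/allocation API).
print(requests.get(ES + "/_cat/allocation?v").text)
```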
As for indexes, there are a few different types. The documents are small in every index, though some are larger than others depending on the source. The one I'm mostly concerned about:
Daily index, average size ~300GB with 275M documents; this will be the most frequently queried index (doc values are enabled). The mappings on this one are quite large (lots of fields and types), but the documents are small; 10 shards, 1 replica. A rough settings sketch is below, after the index descriptions.
The other indexes vary quite a bit, but nothing on that scale.
Index #2: imported once per quarter from a static CSV dataset, average size of 100GB, 95M documents, 5 shards, 2 replicas.
The third index I'm still building, so I have no real stats for it yet.
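To make the first index concrete, here is roughly how it is set up (a minimal Python sketch; the index name, type name, and field names are placeholders, the host URL is an assumption, and the mapping uses ES 1.x syntax where doc values are enabled per field on not_analyzed strings):

```python
import json
import requests

ES = "http://localhost:9200"  # assumption: adjust for your cluster

# Hypothetical daily index with the shard/replica counts described above.
body = {
    "settings": {
        "number_of_shards": 10,
        "number_of_replicas": 1,
    },
    "mappings": {
        # "event" and the field names below are placeholders, not my real mapping.
        "event": {
            "properties": {
                "source":    {"type": "string", "index": "not_analyzed", "doc_values": True},
                "timestamp": {"type": "date", "doc_values": True},
            }
        }
    },
}

# Create the index with settings and mappings in one call.
resp = requests.put(ES + "/events-2016.01.01", data=json.dumps(body))
print(resp.json())
```

The real mapping has far more fields than this, but the shard, replica, and doc_values settings are the relevant part here.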
I got carried away there and forgot to add: Elasticsearch 1.7.5, latest Oracle Java 8.
In terms of EPS, it currently sits at an index rate of about 6K according to Marvel; however, that's about to increase dramatically over the next few months.
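To sanity-check Marvel's number, I sample the indices stats API and compute the rate myself (a rough Python sketch; the host and the 30-second window are arbitrary choices, and the result won't match Marvel's smoothed graph exactly):

```python
import time
import requests

ES = "http://localhost:9200"  # assumption: adjust for your cluster

def total_indexed():
    # Cluster-wide count of index operations on primary shards since node start.
    stats = requests.get(ES + "/_stats/indexing").json()
    return stats["_all"]["primaries"]["indexing"]["index_total"]

# Approximate documents indexed per second over a 30-second window.
before = total_indexed()
time.sleep(30)
after = total_indexed()
print("approx. index rate: %.0f docs/sec" % ((after - before) / 30.0))
```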