Hi all,
I'm currently running an Elasticsearch cluster that's consuming and storing
logs from Logstash.
My architecture is:
- x3 m1.small Logstash nodes, consuming from a local Redis queue and indexing
into Elasticsearch through the elasticsearch_http output
- x5 m1.xlarge Elasticsearch nodes, each with 10 GB of heap assigned to the JVM
running Elasticsearch
node stats:
node config:
I am running Elasticsearch 0.90.1, indexing only around 250 documents a
second, storing roughly 1 billion documents, with 1 index per day and
documents retained for 30 days.
I haven't performed any optimisation apart from setting the field data
cache size to 40%.
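For reference, this is roughly the line I added to elasticsearch.yml (the
setting name is from memory for 0.90.x, so treat it as an assumption and
correct me if it's wrong):

    # elasticsearch.yml - cap the field data cache at 40% of the JVM heap
    indices.fielddata.cache.size: 40%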
I am seeing around 8 GB of heap usage per node, and this is slowly rising
with the number of documents I am indexing. So far I have just kept adding
nodes whenever existing nodes run out of heap and crash. I can't see where
the heap is being used in the node stats section; I'm guessing it's some sort
of caching. I now want to tune the cluster to reduce RAM usage.
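For what it's worth, this is how I've been pulling the stats (against the
pre-1.0 endpoints; paths and flags are from memory, so apologies if they're
slightly off):

    # Per-node stats, including JVM heap and the indices caches
    curl -s 'http://localhost:9200/_cluster/nodes/stats?pretty=true'

    # Cluster-wide index stats (filter cache, id cache, etc.)
    curl -s 'http://localhost:9200/_stats?pretty=true'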
I'm planning to scale this cluster up 100x, so I now need to tune it heavily.
I'm going to give this a go myself first, before I start paying for
support or a consultant.
My questions are:
- Can I realistically reduce heap usage in its current state?
- Do I want to change the architecture of the cluster so that some nodes are
purely data stores, whilst others are dedicated to searching, etc.? (a rough
sketch of what I mean is below the list)
- Will changing fields to not_analyzed reduce memory usage?
- What kind of architecture and resources would a cluster of this size
consume?
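To make those last two questions concrete, here is the kind of change I have
in mind. The index, type, and field names are made up, and the syntax is what
I believe is right for 0.90.x, so please treat these as sketches rather than
something I've tested:

    # Hypothetical mapping change: store "host" as a single not_analyzed term
    curl -XPUT 'http://localhost:9200/logstash-2013.07.01/logs/_mapping' -d '
    {
      "logs": {
        "properties": {
          "host": { "type": "string", "index": "not_analyzed" }
        }
      }
    }'

    # elasticsearch.yml on a dedicated "search/client" node:
    node.master: false
    node.data: false

    # ...and on a pure data node:
    node.master: false
    node.data: true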
Any help on this would be greatly appreciated.
cheers
Will