I'm currently running an Elasticsearch cluster that's storing and consuming
logs from Logstash.
My architecture is:
- 3x m1.small Logstash nodes, consuming from a local Redis queue and
indexing into Elasticsearch through the elasticsearch_http output
- 5x m1.xlarge Elasticsearch nodes, each with 10 GB of heap assigned to the JVM
I am running Elasticsearch 0.90.1, indexing only around 250 documents a
second, storing roughly 1 billion documents across one index per day, and
retaining documents for 30 days.
I haven't performed any optimisation apart from setting the fielddata
cache size to 40% of the heap.
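Concretely, that's this one line in elasticsearch.yml (assuming I have the
0.90.x setting name right):

    # cap the fielddata cache at 40% of the JVM heap
    indices.fielddata.cache.size: 40%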
I am seeing around 8 GB of heap usage per node, and this is slowly rising
with the number of documents I am indexing. I have kept adding nodes
whenever existing nodes run out of heap and crash. I can't see where the
heap is being used in the node stats output; I'm guessing it's some sort of
caching? I now want to tune the cluster to reduce RAM usage.
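For what it's worth, I've been pulling the per-node breakdown with
something like this (the 0.90.x stats API, as far as I can tell), and the
cache numbers it reports don't come close to explaining the 8 GB:

    # per-node JVM heap and indices/cache stats
    curl -XGET "http://localhost:9200/_nodes/stats?jvm=true&indices=true&pretty"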
I'm planning to scale this cluster up 100x, so I now need to tune it
heavily. I'm going to give this a go myself first, before I start paying
for outside help.
My questions are:
- Can I realistically reduce heap usage in its current state?
- Do I want to change the cluster architecture so that some nodes are
purely data stores, whilst others handle searching, etc.? (See the
node-role sketch after this list.)
- Will changing fields to not_analyzed reduce memory usage? (See the
mapping sketch after this list.)
- What kind of architecture and resources would a cluster of this size
require?
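On the node-roles question, my understanding is that splitting the roles is
just a pair of flags in elasticsearch.yml; a sketch of what I mean (I
haven't tried this yet):

    # "data" nodes: hold shards, never elected master
    node.master: false
    node.data: true

    # "client/search" nodes: hold no shards, just coordinate queries
    node.master: false
    node.data: false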
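On the not_analyzed question, what I have in mind is an index template
along these lines, applied automatically to each new daily index (the host
field is just an illustrative example):

    curl -XPUT "http://localhost:9200/_template/logstash_not_analyzed" -d '{
      "template": "logstash-*",
      "mappings": {
        "_default_": {
          "properties": {
            "host": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }'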
Any help on this would be greatly appreciated.