We've noticed that a couple of our data nodes have been having memory
issues, eventually leading to them dropping out of the cluster. The other
nodes in the cluster are nowhere near their memory limits. Looking at the GC
logs on the problem boxes, the nodes seem to reach a point after some time
(about 3 days) where they start garbage collecting every 30 seconds. We also
see some out-of-memory errors in the main log file. We think these may
coincide with the new indices being created at midnight for the start of the
new day, but we are still in the process of confirming this.
I'm sure it's a configuration issue, but I'm not sure where to even look to
figure out what needs tweaking. I'm going to describe our setup below in the
hope that you may know how to help.
- Deploying to AWS
- Using Elasticsearch 1.1.0 and the equivalent aws-elasticsearch-plugin.
- Using Java 7 on CentOS 6
- 6 Data nodes are m1.large
- 3 Master nodes m1.medium
- All nodes have ES_HEAP_SIZE = 5g
- All nodes have MAX_OPEN_FILES=65535
- 3 shards per index
- 1 replica
- TTL is set globally to 30 days
- Flume is used to push log data into the cluster.
The big one is the number of indices. We use the cluster for our application
logging data. We have multiple applications, and each application's logs for
each day go into a separate index, so new applications will fire up and start
sending their logs to the cluster over time. It's similar to the logstash
setup, the difference being that each application writes to its own daily
index rather than to one big global daily index, if that makes sense.
Currently we have about 250 indices.
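
To put a number on that, here is a rough back-of-the-envelope sketch of what the settings above imply. The index naming pattern and the function names are made up for illustration (our real names may differ), and the per-node figure assumes the shards are spread evenly across the data nodes, which is exactly the part we are unsure about.

```python
from datetime import date


def daily_index_name(app: str, day: date) -> str:
    # Hypothetical naming scheme: one index per application per day.
    return f"{app}-{day:%Y.%m.%d}"


def total_shard_copies(indices: int, primaries: int, replicas: int) -> int:
    # Every index has `primaries` primary shards, and each primary
    # has `replicas` additional copies.
    return indices * primaries * (1 + replicas)


# Figures taken from the setup described above.
INDICES = 250
PRIMARIES_PER_INDEX = 3
REPLICAS = 1
DATA_NODES = 6

total = total_shard_copies(INDICES, PRIMARIES_PER_INDEX, REPLICAS)
# Assumption: shards are balanced evenly across the data nodes.
per_node = total / DATA_NODES

print(daily_index_name("billing", date(2014, 4, 1)))
print(f"{total} shard copies in total, about {per_node:.0f} per data node")
```

So with 250 indices we are looking at roughly 1500 shard copies, or about 250 per data node if they are evenly balanced, and that count grows every day and with every new application.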
One of the things we have been considering is scaling up to deal with the
issue, as we are on AWS. However, we would first like to understand how
Elasticsearch distributes load among the data nodes. When we scale up we
want the load spread across the data nodes according to how busy they are;
otherwise scaling up may not have a significant effect.
Any help would be appreciated.