We have an Elasticsearch cluster running in production with the configuration
below. Our cluster has a master/data node topology, with 8 master nodes
serving traffic. The issue we have been facing very recently is that the
garbage collector is running more frequently and also taking longer than
usual, and has been for the past few days. Total garbage collection across
all master nodes takes a maximum of around 250s to complete. Search requests
have increased a bit of late, and we are still using niofs as the index
store. This is affecting our latencies, and ultimately requests get dropped
once the cluster is no longer able to handle the traffic. We have a queue on
top of incoming requests, and it is constantly at ~10. Most of our requests
are either
- "number_of_nodes": 17,
- "number_of_data_nodes": 9,
- "active_primary_shards": 24730,
- "active_shards": 74196,
- Elasticsearch version: 0.90.11
- Java: 1.7u55
- Garbage collector: G1
We have one index per user, with one shard and one replica for each index.
The current number of indices is 24730, and we expect it to grow to as many
as 45000. We created around 10k indices in the last month, and we just
upgraded to Java 1.7u55 from 1.7u51.
Our master nodes have 64 GB of RAM, and Elasticsearch is using half of it.
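As a back-of-the-envelope sanity check on the shard counts (a sketch using only the cluster stats quoted above; note that active_shards divided by the index count comes out to about 3 copies per index, which does not quite match the stated 1 primary + 1 replica):

```python
# Shard arithmetic from the cluster stats quoted above.
indices = 24730
active_shards = 74196   # primaries + replicas, from cluster health
data_nodes = 9

# Each shard is a full Lucene index with its own heap overhead,
# so thousands of shards per node is a plausible source of GC pressure.
shards_per_node = active_shards / data_nodes
print("shards per data node now: %d" % shards_per_node)        # 8244

# active_shards / indices ~= 3.00, i.e. roughly 3 shard copies per
# index rather than the 2 implied by "one shard + one replica".
copies_per_index = active_shards / indices
print("shard copies per index: %.2f" % copies_per_index)

# Projection at 45k indices, assuming the stated 1 shard + 1 replica:
projected_per_node = 45000 * 2 / data_nodes
print("projected shards per data node at 45k indices: %d" % projected_per_node)  # 10000
```

At ~8k shards per data node today, shard bookkeeping alone is a heap cost worth measuring independently of search traffic.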
So how do search requests affect heap size and garbage collection? We do
not have any cross-index search requests. What could be the possible reasons
for GC taking so much time over the past few days (it started even before
the Java upgrade)? Can we use mmapfs so that less heap gets used and
hence GC could run faster?
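For reference, a minimal sketch of switching the index store to mmapfs in elasticsearch.yml, assuming the setting name used in the 0.90.x line. One caveat: mmapfs memory-maps index files through the OS page cache, which is off-heap, so it changes virtual-memory usage rather than directly shrinking the JVM heap.

```yaml
# elasticsearch.yml -- assumed setting name for the 0.90.x line.
# mmapfs memory-maps Lucene index files via the OS page cache (off-heap);
# it does not by itself reduce JVM heap usage or GC pause times.
index.store.type: mmapfs
```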