Garbage collector taking more time with increase in search requests

Hi,

We have an Elasticsearch cluster running in production with the
configuration below. Our cluster has a master-data node topology with 8
master nodes serving traffic. The issue we have been facing very recently
is that the garbage collector is running more frequently and also taking
longer than usual for the past few days. Total garbage collection on all
master nodes takes a maximum of around 250s to complete. Search requests
have increased a bit of late and we are still using niofs as the index
store type
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/index-modules-store.html).
This is affecting our latencies and ultimately requests get dropped once
the cluster is not able to handle the traffic. We have a queue on top of
incoming requests and it is constantly at ~10. Most of our requests are
gets, bulk updates, or searches.
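
In case it helps, the GC numbers above can be read from the nodes stats
API; on 0.90 something like the following should show per-collector counts
and times (host and port are placeholders):

  curl -XGET 'http://localhost:9200/_nodes/stats?jvm=true&pretty=true'
  # look under nodes.<node_id>.jvm.gc for collection_count and
  # collection_time_in_millis per collector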

Configuration (the node/shard figures are from cluster health; see the
example call after this list):

  • "number_of_nodes": 17,
  • "number_of_data_nodes": 9,
  • "active_primary_shards": 24730,
  • "active_shards": 74196,
  • Elasticsearch version: 0.90.11
  • Java: 1.7u55
  • Garbage collector: G1
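
For reference, a minimal way to pull those node and shard figures (host
and port are placeholders):

  curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'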

We have one index per user, with one shard and one replica for each index.
The current number of indices is 24730 and we expect it to grow to as many
as 45000. We created around 10k indices in the last month. And we just
upgraded from Java 1.7u51 to 1.7u55.
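
For context, each per-user index is created with settings along these
lines (the index name here is just an example):

  curl -XPUT 'http://localhost:9200/user_12345' -d '{
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }'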

Our master nodes have 64 GB of RAM and Elasticsearch is using half of it.
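
For reference, the heap would normally be set through the ES_HEAP_SIZE
environment variable that the 0.90 startup script reads; 32g here is
simply half of the 64 GB box:

  export ES_HEAP_SIZE=32g
  # note: at around 32g and above the JVM loses compressed oops, so
  # staying a bit under 32g is commonly recommended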

So how do search requests affect heap usage and garbage collection? We do
not have any cross-index search requests. What could be the possible
reasons for GC taking so much time over the past few days (it started even
before the Java upgrade)? Can we use mmapfs so that less heap gets used
and hence GC could run faster?

Thanks,

Raj


mmapfs is great on x64, and upgrade to 1.X if you can.

But you may just be reaching the limits of the cluster and therefore need
more nodes.
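
If you do try mmapfs, keep in mind that index.store.type needs to be in
place when an index is created (or node-wide in elasticsearch.yml); as far
as I know it is not picked up by existing open indices. Roughly, with an
illustrative index name:

  # in elasticsearch.yml (applies to indices created after a restart):
  #   index.store.type: mmapfs

  # or per index, at creation time:
  curl -XPUT 'http://localhost:9200/some_new_index' -d '{
    "settings": { "index.store.type": "mmapfs" }
  }'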

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com
