Hello,
We have an Elasticsearch cluster storing logs with the following
configuration:
- One index per day; we keep only the last 15 days
- Each index is ~8 GB (slightly increasing)
- The cluster is composed of 2 nodes, each a VM with 6 GB of RAM
- Nodes run ES 0.20.2 and have the following config:
cluster.name: eslogcluster
node.name: "eslogcluster_2"
index.number_of_shards: 2
index.number_of_replicas: 1
path.data: /usr/local/share/elasticsearch/data
index.cache.field.max_size: 2500
index.cache.field.expire: 10m
index.cache.field.type: soft
- The JVM arguments are the following:
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
-Delasticsearch-service
-Des.path.home=/usr/local/share/elasticsearch-0.20.2
-Xss256k
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError
-Djava.awt.headless=true
-Xms3072m
-Xmx3072m
[...]
- The health of the cluster:
"cluster_name" : "eslogcluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 2,
"active_primary_shards" : 30,
"active_shards" : 60,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
Our problem is that sometimes ES starts garbage collecting a lot for a few
minutes. While this is happening, we cannot run any queries.
We get logs like:
[2013-02-25 13:56:38,861][INFO ][monitor.jvm ]
[eslogcluster_1] [gc][ConcurrentMarkSweep][340948][4742] duration [5.1s],
collections [1]/[5.6s], total [5.1s]/[33.6m], memory
[2.9gb]->[2.7gb]/[2.9gb], all_pools {[Code Cache]
[9.3mb]->[9.3mb]/[48mb]}{[Par Eden Space]
[266.2mb]->[71.1mb]/[266.2mb]}{[Par Survivor Space]
[33.2mb]->[0b]/[33.2mb]}{[CMS Old Gen] [2.6gb]->[2.6gb]/[2.6gb]}{[CMS Perm
Gen] [32.8mb]->[32.8mb]/[166mb]}
[2013-02-25 13:56:59,043][INFO ][monitor.jvm ]
[eslogcluster_1] [gc][ConcurrentMarkSweep][340953][4746] duration [5.1s],
collections [1]/[5.6s], total [5.1s]/[33.9m], memory
[2.7gb]->[2.8gb]/[2.9gb], all_pools {[Code Cache]
[9.3mb]->[9.3mb]/[48mb]}{[Par Eden Space]
[128.5mb]->[144.8mb]/[266.2mb]}{[Par Survivor Space]
[0b]->[0b]/[33.2mb]}{[CMS Old Gen] [2.6gb]->[2.6gb]/[2.6gb]}{[CMS Perm Gen]
[32.8mb]->[32.8mb]/[166mb]}
[2013-02-25 13:57:05,711][INFO ][monitor.jvm ]
[eslogcluster_1] [gc][ConcurrentMarkSweep][340955][4747] duration [5.1s],
collections [1]/[5.6s], total [5.1s]/[34m], memory
[2.9gb]->[2.7gb]/[2.9gb], all_pools {[Code Cache]
[9.3mb]->[9.3mb]/[48mb]}{[Par Eden Space]
[266.2mb]->[126mb]/[266.2mb]}{[Par Survivor Space]
[33.2mb]->[0b]/[33.2mb]}{[CMS Old Gen] [2.6gb]->[2.6gb]/[2.6gb]}{[CMS Perm
Gen] [32.8mb]->[32.8mb]/[166mb]}
...
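As an aside, pulling the key figures out of one of those lines makes the symptom clearer (a quick sketch; the regex assumes the exact `memory [before]->[after]/[max]` format shown above): even right after a ConcurrentMarkSweep collection, the heap stays almost full.

```python
import re

# One of the monitor.jvm log lines from above, trimmed to the relevant part.
line = ("[gc][ConcurrentMarkSweep][340948][4742] duration [5.1s], "
        "collections [1]/[5.6s], total [5.1s]/[33.6m], memory "
        "[2.9gb]->[2.7gb]/[2.9gb]")

# Extract heap usage before the GC, after the GC, and the heap maximum.
m = re.search(r"memory \[([\d.]+)gb\]->\[([\d.]+)gb\]/\[([\d.]+)gb\]", line)
before, after, total = map(float, m.groups())

# A 5.1s stop-the-world collection that only frees ~0.2 GB, leaving the
# heap ~93% full, means the next collection is never far away.
print(f"freed {before - after:.1f} GB; heap {after / total:.0%} full after GC")
```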
Do you know what could be done to improve this? Using the "index.cache.field.*"
parameters helped quite a lot in making the problem less frequent, but it keeps
happening. Should we add more RAM? Should we increase the Java heap size? (ES is
not alone on the machines; we set the heap size to only 3 GB to make sure it
does not swap.)
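To put some numbers on that question, here is the back-of-the-envelope math using only the figures above (a sketch; the "data-to-heap ratio" is just shorthand, not an official ES metric):

```python
# Sizing figures taken from the description above.
indices = 15          # one index per day, 15 days retained
index_size_gb = 8     # ~8 GB per index
replicas = 1          # index.number_of_replicas: 1
data_nodes = 2
heap_gb = 3           # -Xmx3072m per node

total_primary_gb = indices * index_size_gb                    # 120 GB
total_with_replicas_gb = total_primary_gb * (1 + replicas)    # 240 GB
per_node_gb = total_with_replicas_gb / data_nodes             # 120 GB/node

# Each node holds roughly 40x more index data than it has heap,
# which is why the field cache pressure shows up so quickly.
print(f"{per_node_gb / heap_gb:.0f}x data per node relative to heap")
```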
Thanks
--
Quentin Barbe