High CPU & Memory Usage

(Astro) #1


We have been facing high CPU and memory usage on our ES cluster, which is failing to deliver logs to the dashboard.

ES cluster: 2 nodes
ES version: 2.0.0
No. of total indices: 13
No. of open indices: 3
Total size of open indices: 28GB
RAM: 16GB (master)
No. of replicas: 1
No. of shards: 2

bootstrap.mlockall: true
indices.fielddata.cache.size: 20%
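Not shown in the settings above, but relevant to the GC behaviour in the log below: on ES 2.x the JVM heap is sized via the ES_HEAP_SIZE environment variable. A minimal sketch, assuming the node is started from the tarball distribution (init-script installs set this in a defaults file instead):

```shell
# Sketch only: size the ES 2.x heap before starting the node.
# A common guideline is ~50% of RAM, i.e. 8g on a 16GB box; the
# [7.9gb] heap in the GC log suggests a value close to this is in use.
export ES_HEAP_SIZE=8g   # applied to both -Xms and -Xmx
bin/elasticsearch
```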

ES logs:

[2015-12-04 05:39:55,631][WARN ][monitor.jvm ] [elastic-node-1] [gc][young][565][1271] duration [2.3s], collections [1]/[2.6s], total [2.3s]/[3.1m], memory [5.9gb]->[6gb]/[7.9gb], all_pools {[young] [123.9kb]->[4.1kb]/[133.1mb]}{[survivor] [16.6mb]->[16.6mb]/[16.6mb]}{[old] [5.9gb]->[6gb]/[7.8gb]}

Fluentd logs:

2015-12-04 08:31:31 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2015-12-04 09:34:07 +0000 error_class="Fluent::ElasticsearchOutput::ConnectionFailure" error="Could not push logs to Elasticsearch after 2 retries. read timeout reached" plugin_id="object:3f43554332e5c"
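The read timeout above means Fluentd gave up waiting while ES was busy (likely mid-GC). Independently of fixing the cluster, the output can be made more tolerant. A hedged sketch of a fluent-plugin-elasticsearch match block; parameter values here are illustrative, not taken from the thread:

```
<match **>
  @type elasticsearch
  host localhost
  port 9200
  # Allow for long GC pauses before declaring the request dead
  request_timeout 30s
  # Smaller, more frequent flushes reduce per-request pressure on ES
  buffer_chunk_limit 8m
  flush_interval 10s
  retry_wait 5s
</match>
```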

free -m
             total       used       free     shared    buffers     cached
Mem:         15298      14596        702          0        101       3265
-/+ buffers/cache:      11229       4069
Swap:            0          0          0

What may be causing this load? What hardware and Elasticsearch configurations are recommended to bring it back to normal?
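On the sizing question: a widely cited Elasticsearch guideline (not something stated in this thread) is to give the JVM heap about half of physical RAM, staying below ~32GB so the JVM keeps using compressed object pointers. A minimal sketch of that arithmetic:

```python
def recommended_heap_gb(ram_gb: int) -> int:
    """Rule-of-thumb ES heap size: half of physical RAM,
    capped below 32 GB to preserve compressed oops."""
    return min(ram_gb // 2, 31)

# For the 16GB nodes in this thread:
print(recommended_heap_gb(16))  # -> 8
```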

(Mark Walkom) #2

That's an awfully long GC.

What sort of data is it? Do you have Marvel installed?

(Astro) #3

We're storing logs from multiple servers in the ES cluster.

We don't have Marvel installed.

(Mark Walkom) #4

You should install Marvel so you can see what is happening.
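For ES 2.0, Marvel ships as node and Kibana plugins. A sketch of the install steps, from memory of the 2.x plugin layout; verify the exact commands and version against the Marvel documentation for your release:

```shell
# On each ES node (2.x plugin CLI); restart the node afterwards
bin/plugin install license
bin/plugin install marvel-agent

# On the Kibana host, to get the Marvel UI
bin/kibana plugin --install elasticsearch/marvel/latest
```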

(system) #5