Elasticsearch version: 2.3.3
Plugins installed: []
JVM version: 1.8_091
OS version: Linux, kernel 3.10.101
Description of the problem including expected versus actual behavior:
We deployed 12 ES nodes across 3 machines. We inserted 1 billion records into the cluster while also sending search queries to it. After 3 days, the nodes' JVM heaps started full-GCing (FGC) frequently. jstat gcutil on a node showed: 3.8GB total ES_HEAP, a 2.6GB old generation, and old gen usage at 100%.
We stopped all write and read requests, but old gen usage did not decrease; it stayed at 100%.
We cleared the ES caches, but old gen usage only dropped to 96%.
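For reference, this is roughly how we check old-gen usage and clear the caches. The node address is a placeholder; this is only a minimal sketch against the nodes-stats (_nodes/stats/jvm) and clear-cache (_cache/clear) APIs:

import requests

ES = "http://localhost:9200"   # placeholder; we point this at one of the data nodes

# Report old-gen usage per node from the nodes-stats JVM section.
stats = requests.get(ES + "/_nodes/stats/jvm").json()
for node_id, node in stats["nodes"].items():
    old = node["jvm"]["mem"]["pools"]["old"]   # CMS old-gen pool stats
    pct = 100.0 * old["used_in_bytes"] / old["max_in_bytes"]
    print("%s old gen used: %.1f%%" % (node.get("name", node_id), pct))

# Clear the fielddata/query/request caches on all indices.
requests.post(ES + "/_cache/clear")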
Steps to reproduce:
1. Use the config below.
2. Run bulk inserts and queries concurrently (a rough sketch of the load is shown after this list).
3. After the data grows to 1 billion records, heap usage reaches 100%.
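The load we apply has roughly the following shape. The index name, mapping type, document fields, and node address below are illustrative placeholders, not our production schema; this is a minimal single-threaded sketch using the standard _bulk and _search APIs via the requests library:

import json
import random
import requests

ES = "http://localhost:9200"   # placeholder node address
INDEX = "test_index"           # illustrative index name
TYPE = "doc"                   # illustrative mapping type (types are still required in 2.x)

def bulk_insert(batch_size=1000):
    """Index one batch of synthetic documents via the _bulk API."""
    lines = []
    for i in range(batch_size):
        lines.append(json.dumps({"index": {"_index": INDEX, "_type": TYPE}}))
        lines.append(json.dumps({"user_id": random.randint(0, 10 ** 9),
                                 "msg": "record %d" % i}))
    requests.post(ES + "/_bulk", data="\n".join(lines) + "\n")

def search_once():
    """Run a simple term query against the same index."""
    q = {"query": {"term": {"user_id": random.randint(0, 10 ** 9)}}}
    requests.post(ES + "/" + INDEX + "/_search", data=json.dumps(q))

if __name__ == "__main__":
    # The real load runs many writer and reader clients in parallel;
    # this loop only illustrates the shape of the requests.
    while True:
        bulk_insert()
        search_once()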
Provide logs (if relevant):
[2017-02-05 11:58:38,221][INFO ][monitor.jvm ] [xxx] [gc][old][1332853][268813] duration [5s], collections [1]/[5.1s], total [5s]/[17.2h], memory [3.8gb]->[2.8gb]/[3.8gb], all_pools {[young] [1gb]->[166.8mb]/[1gb]}{[survivor] [130.7mb]->[0b]/[136.5mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-02-05 12:04:39,450][INFO ][monitor.jvm ] [xxx] [gc][old][1333207][268917] duration [6.2s], collections [1]/[6.5s], total [6.2s]/[17.2h], memory [3.8gb]->[2.8gb]/[3.8gb], all_pools {[young] [1gb]->[165.2mb]/[1gb]}{[survivor] [111mb]->[0b]/[136.5mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-02-05 12:10:10,528][INFO ][monitor.jvm ] [xxx] [gc][old][1333531][269014] duration [6.1s], collections [1]/[6.7s], total [6.1s]/[17.2h], memory [3.8gb]->[2.8gb]/[3.8gb], all_pools {[young] [1gb]->[167.1mb]/[1gb]}{[survivor] [97.3mb]->[0b]/[136.5mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-02-05 12:33:23,673][INFO ][monitor.jvm ] [xxx] [gc][old][1334903][269423] duration [5.1s], collections [1]/[5.3s], total [5.1s]/[17.3h], memory [3.8gb]->[2.8gb]/[3.8gb], all_pools {[young] [1gb]->[168.3mb]/[1gb]}{[survivor] [119.5mb]->[0b]/[136.5mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-02-05 14:12:11,023][INFO ][monitor.jvm ] [xxx] [gc][old][1340737][271155] duration [5s], collections [1]/[5.8s], total [5s]/[17.4h], memory [3.8gb]->[2.8gb]/[3.8gb], all_pools {[young] [1gb]->[166.2mb]/[1gb]}{[survivor] [128.7mb]->[0b]/[136.5mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
Our elasticsearch.yml:
node.master: false
node.data: true
node.zone: findbugs_grp9
#
# Add custom attributes to the node:
#
# node.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /data1/es_Data_2017-01-20_164005/es_0/data
#
# Path to log files:
#
path.logs: /data1/es_Data_2017-01-20_164005/es_0/logs
#
# ---------------------------------- Indices -----------------------------------
index.number_of_shards: 12
index.number_of_replicas: 1
# index.refresh_interval: "10s"
# ----------------------------------- Threadpool -----------------------------------
threadpool.bulk.queue_size: 400
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.mlockall: true
#
# Make sure that the `ES_HEAP_SIZE` environment variable is set to about half the memory
# available on the system and that the owner of the process is allowed to use this limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: xxx
#
# Set a custom port for HTTP:
#
http.port: 9200
transport.tcp.port: 9300
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html>
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["xxx:9309","100.65.8.219:9309","10.231.135.40:9309"]
# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 2
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html>
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
gateway.recover_after_data_nodes: 4
#gateway.recover_after_nodes: 4
gateway.expected_data_nodes: 8
gateway.recover_after_time: 5m
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html>
#
# ---------------------------------- Various -----------------------------------
#
# Disable starting multiple nodes on a single system:
#
# node.max_local_storage_nodes: 4
#
# Require explicit names when deleting indices:
#
# action.destructive_requires_name: true
#
# ---------------------------------- Cluster -----------------------------------
cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: findbugs_grp9,findbugs_grp219,findbugs_grp40