ELK for 30k eps - memory problem

em01 · April 7, 2016, 2:16pm

Hi,
We are setting up an Elasticsearch cluster for OS logs. We want the system to ingest a constant 30k events per second (eps) and store the data for compliance.
After some days the system stops working: nodes become unavailable and JVM memory utilization hits 99%. On the Elasticsearch nodes we saw: [FIELDDATA] New used memory 19873079695 [18.5gb] from field [@timestamp] would be larger than configured breaker 19851234508 [18.4gb], breaking.
Our memory config was:
indices.fielddata.cache.size: 60%
indices.breaker.fielddata.limit: 80%
indices.cache.filter.size: 30%

As we are logging time-oriented data, I expect problems with storing all timestamps in memory. We tried lowering the breaker to a value like 30%, but that did not help much.

For this environment we set up 11 physical machines with 128 GB RAM and a lot of storage in RAID6.
On 3 servers we run 1x master and 1x data node.
On 8 servers we run 2x data nodes.
Indices have primary shards + 2 replicas.
The system holds 54 TB of data in primary shards (~160 TB replicated).

We run Elasticsearch 1.7.3.
Recently one node hit 99% JVM memory and threw a Java out-of-memory exception, which caused the whole cluster to become unstable.

We barely search the data at all. Only Marvel is monitoring the system state.
My question is:
Can we force the system not to load @timestamp into RAM? Does it happen for every open index?
We mainly care about the current index, as the indexing process writes data to it. But we cannot close all the others, because they must stay available for possible searches.

Can Elasticsearch generally be considered for archiving purposes at the level of 30k eps?
What could be done to limit memory utilisation? I think we followed all the best practices for swappiness etc.

Appreciate your help,

Igor_Motov · April 7, 2016, 3:03pm

Hi em01,

By default, your version of Elasticsearch builds fielddata every time you access the timestamp field for aggregation or sorting, which happens pretty much every time you open any Kibana dashboard or execute a search query that sorts by timestamp. Because building fielddata is an expensive process, Elasticsearch caches it in memory. Please see https://www.elastic.co/guide/en/elasticsearch/guide/master/fielddata.html and https://www.elastic.co/guide/en/elasticsearch/guide/master/doc-values.html for more information.
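
To put the breaker message from the question in perspective, here is a small back-of-the-envelope sketch in Python. It only uses the two byte counts from the quoted log line and assumes (this is not confirmed in the thread) that the 80% value of indices.breaker.fielddata.limit was in effect when the line was logged.

```python
# Numbers copied from the log line quoted in the question (bytes).
REQUESTED_BYTES = 19873079695      # "New used memory"
BREAKER_LIMIT_BYTES = 19851234508  # breaker limit reported in the log

# Assumption: the breaker was at 80% of heap when this was logged.
BREAKER_FRACTION = 0.80

GIB = 1024 ** 3
MIB = 1024 ** 2


def implied_heap_bytes(limit_bytes, fraction):
    """Heap size implied by a breaker limit given as a fraction of heap."""
    return limit_bytes / fraction


def overshoot_bytes(requested, limit):
    """How far the requested fielddata load exceeds the breaker limit."""
    return requested - limit


heap = implied_heap_bytes(BREAKER_LIMIT_BYTES, BREAKER_FRACTION)
over = overshoot_bytes(REQUESTED_BYTES, BREAKER_LIMIT_BYTES)

print("implied heap per node: %.1f GiB" % (heap / GIB))
print("overshoot of the limit: %.1f MiB" % (over / MIB))
```

Under that assumption the heap works out to roughly 23 GiB per node, and fielddata for @timestamp alone was about to fill 80% of it. Lowering the breaker only makes requests fail earlier; it does not reduce how much fielddata the queries need.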

So, to avoid this problem, you can either give more memory to your Elasticsearch cluster by adding more nodes, or you can switch to doc values for the timestamp field. See Using Doc Values for some pointers on how to enable doc values at the Logstash level.
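
As a rough illustration of the doc values option, the Python sketch below builds an index template body that maps @timestamp as a date with doc values enabled, using the 1.x template syntax. The index pattern logstash-* and the function name are assumptions made for the example, not details from this thread; adjust them to your own index naming.

```python
import json


def timestamp_doc_values_template(index_pattern="logstash-*"):
    """Build a 1.x-style index template body that stores @timestamp
    as doc values (on disk) instead of loading it into fielddata."""
    return {
        # Assumed pattern; must match the names of your daily indices.
        "template": index_pattern,
        "mappings": {
            # _default_ applies the mapping to every document type.
            "_default_": {
                "properties": {
                    "@timestamp": {"type": "date", "doc_values": True}
                }
            }
        },
    }


# Print the JSON body; it would be sent with PUT /_template/<name>.
print(json.dumps(timestamp_doc_values_template(), indent=2))
```

A template only affects indices created after it is installed. Existing indices keep using fielddata for @timestamp until they are reindexed, so the memory pressure from older indices will not disappear immediately.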
