Memory usage for ELK-style systems

Hi,

I'm using Elasticsearch with Kibana as a searchable/aggregatable event log, so a kind of ELK installation, although without Logstash. Recently we made the jump from Elasticsearch 0.20 all the way up to 1.5 and added Kibana. Things have been good, but the memory profile less so on smaller machines. These machines also share memory with a Cassandra installation, and both are memory-hungry.

Questions:

  1. Newer Elasticsearch uses mmapfs for some parts. How does this affect memory usage? Are there alternatives to this setting which use less memory if we can sacrifice search performance?
  2. How does index.number_of_replicas affect memory usage?

Changes we are evaluating, please comment on these:

a. ES_HEAP_SIZE tuning. It's at 25% of physical RAM right now to accommodate the neighbouring Cassandra process, which is also set to a max heap of 25%. I think that is reasonable, but we have to measure more.
b. Modifying term_index settings to: index.term_index_interval: 256 and index.term_index_divisor: 5
c. We have set indices.fielddata.cache.size to 40% and are considering lowering it to 20%
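
To make the numbers in (a) and (c) concrete, here is a rough Python sketch of the arithmetic. The 16 GiB machine size and the function name are made up for illustration; only the 25% heap figures and the 40%/20% fielddata percentages come from the post. It assumes the fielddata percentage is taken relative to the Elasticsearch heap, not to physical RAM.

```python
GIB = 1024 ** 3

def memory_budget(physical_ram, es_heap_pct, cassandra_heap_pct, fielddata_pct):
    """Rough split of RAM between the two JVM heaps and everything else.

    All percentages are whole numbers (25 means 25%). Results are in bytes.
    """
    es_heap = physical_ram * es_heap_pct // 100
    cassandra_heap = physical_ram * cassandra_heap_pct // 100
    # Assumption: the fielddata cache limit is a fraction of the ES heap.
    fielddata_cap = es_heap * fielddata_pct // 100
    # Whatever is left serves the OS, the page cache (including mmapped
    # index files) and off-heap allocations of both processes.
    left_for_os = physical_ram - es_heap - cassandra_heap
    return {
        "es_heap": es_heap,
        "cassandra_heap": cassandra_heap,
        "fielddata_cap": fielddata_cap,
        "left_for_os": left_for_os,
    }

if __name__ == "__main__":
    for pct in (40, 20):
        budget = memory_budget(16 * GIB, 25, 25, pct)
        print(pct, {k: round(v / GIB, 2) for k, v in budget.items()})
```

On this hypothetical machine, lowering the cache limit from 40% to 20% only changes how much of the existing Elasticsearch heap fielddata may occupy; it does not by itself shrink the heap.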

Thanks!

The more replicas you have the more memory you will need.

However, given that you are running two memory-hungry apps on the same server, you are playing with fire, and we don't recommend doing this, for obvious reasons.

Thanks. We are aware of the issues with not having dedicated elasticsearch machines and agree.

The main point was that lowering replica count should lower memory requirements.

Regarding mmapfs, my understanding is that mmapped files only change the VIRT memory reported by 'top', which is tricky to interpret for a Java process anyway.
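
One way to check this on a Linux box is to compare the virtual and resident sizes of the Elasticsearch process directly. The following is only a sketch: the helper names are invented here, and it assumes a Linux /proc filesystem. Note that mapped pages which have actually been touched can also show up in the resident figure, although they are file-backed and the kernel can reclaim them under pressure.

```python
def parse_proc_status(text):
    """Extract VmSize (virtual) and VmRSS (resident) in kB from the
    contents of /proc/<pid>/status."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmRSS:    2123456 kB"
            values[key] = int(rest.split()[0])
    return values

def read_proc_status(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_proc_status(f.read())
```

Calling read_proc_status with the pid of the Elasticsearch JVM before and after a heavy search would show how much of the difference lands in VmSize versus VmRSS.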

Lastly, indices.fielddata.cache.size and index.term_index_interval we will have to benchmark using our own data and usage. Is there any HTTP API to get memory statistics from Elasticsearch, or do you only use JVM tools for that?

There are heaps of monitoring APIs.

I'd start with https://www.elastic.co/guide/en/elasticsearch/reference/current/cat.html
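
Besides the cat APIs, the nodes stats API returns JSON that is easy to script against. Below is a minimal Python sketch, not an official client: the base URL http://localhost:9200 and the function names are assumptions, and the field names (heap_used_in_bytes, heap_max_in_bytes, fielddata memory_size_in_bytes) are the ones I would expect in a nodes stats response, so check them against what your version actually returns.

```python
import json
from urllib.request import urlopen

def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch JVM and index-level stats for all nodes.

    base_url is an assumption; point it at one of your own nodes.
    """
    with urlopen(base_url + "/_nodes/stats/jvm,indices") as resp:
        return json.loads(resp.read().decode("utf-8"))

def summarize_memory(stats):
    """Reduce a nodes stats response to heap and fielddata usage per node."""
    summary = {}
    for node in stats.get("nodes", {}).values():
        mem = node.get("jvm", {}).get("mem", {})
        fielddata = node.get("indices", {}).get("fielddata", {})
        summary[node.get("name", "?")] = {
            "heap_used_bytes": mem.get("heap_used_in_bytes"),
            "heap_max_bytes": mem.get("heap_max_in_bytes"),
            "fielddata_bytes": fielddata.get("memory_size_in_bytes"),
        }
    return summary

if __name__ == "__main__":
    for name, usage in summarize_memory(fetch_node_stats()).items():
        print(name, usage)
```

Polling this while changing indices.fielddata.cache.size gives a before/after comparison without attaching JVM tools.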