I am using ES version 1.1.1 on a single node with the below settings:
2 shards per index;
currently 395 indices;
currently 428GB;
18GB heap committed on a 24GB machine;
I use it only for indexing for now, because I noticed memory issues when
performing searches: Elasticsearch became unresponsive...
My current requirement is to use Elasticsearch in a single-machine
configuration only.
I already tried to tune the JVM (Xmn to 2GB, after noticing very frequent
full garbage collections) without success.
Does anyone have any advice for me?
Is the only way to increase the heap (do you have a ratio of RAM to data
size)?
Would adding additional nodes on the same machine help, and why?
Final question: I cannot see why so much RAM is used during indexing alone;
can anybody explain?
You are more than likely reaching the limits of the node.
Your options are to delete data, add more RAM (heap should be 50% of
system RAM), close some old indices, or add nodes. Adding more nodes
spreads the shards of your indices across the cluster, which essentially
spreads the load.
You could also try disabling the bloom filter cache, as this will reduce your
memory usage a little; take a look at Elasticsearch Curator, which can do this for you.
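Roughly, as a sketch of what that looks like on the 1.x line (the index name is
just an example, and the exact setting/command names can differ between
versions, so check the docs for your release):

  # Stop loading bloom filters for an index you are no longer actively writing to
  # (dynamic index setting on ES 1.x; "logstash-2014.06.01" is an example name):
  curl -XPUT 'localhost:9200/logstash-2014.06.01/_settings' -d '
  { "index.codec.bloom.load": false }'

  # Curator can apply the same change to all indices older than N days, e.g.
  # (command syntax varies between Curator versions):
  curator --host localhost bloom --older-than 2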
Thanks for your quick answer.
I cannot increase the RAM for ES, as I am already using 75% of the machine's
RAM for the JVM.
I will take a look at disabling the bloom filter cache to see if that
changes anything.
Regarding the option of adding more nodes:
Do you have an idea of how many nodes are required to sustain that
quantity of data (428GB)? I would say 3 nodes, one for each shard and an
extra node for replicas; do you agree?
Assuming that I do not use replicas at first, do you have an idea of the
quantity of RAM needed for the two nodes (continuous indexing and few
search requests)?
When it comes to capacity, the answer is: it depends.
Given you're at around 430GB on a single node now, I'd add another node and
then see how things look at around the 800-900GB mark (spread across both).
Another clarification: the recommended operating procedure is to use half
your system RAM for the heap, leaving the other half for OS caching, which
increases the performance of ES. In your case, under the best possible
circumstances, you should really only be using 12GB of your total 24GB for the heap.
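For reference, on the 1.x packages the heap is usually set through the
ES_HEAP_SIZE environment variable; exactly where it is picked up depends on
your install (e.g. /etc/default/elasticsearch or /etc/sysconfig/elasticsearch).
A minimal sketch for the 24GB machine discussed here:

  # Half of the 24GB machine for the heap, the rest left to the OS page cache.
  # Put this wherever your init script reads its environment.
  export ES_HEAP_SIZE=12g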
You should tweak the cache sizes. At the very least, the field data cache
needs to be restricted (it is unbounded by default). Also, ensuring the
various circuit breakers are turned on will help. Another tip is to disable
the _all field if you don't need it.
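As a sketch, the two cache-related knobs above live in elasticsearch.yml on
the 1.x line; the percentages are illustrative, not recommendations:

  # Bound the field data cache (unbounded by default on 1.x):
  indices.fielddata.cache.size: 40%
  # Field data circuit breaker: reject a request that would push field data
  # beyond this share of the heap instead of running the node out of memory:
  indices.fielddata.breaker.limit: 60%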
All this should reduce the amount of memory ES uses and make it less likely
that your cluster becomes unavailable. We use Elasticsearch with Kibana in our
production setup. Things definitely do not fail gracefully if you run short
of memory, so you need to prevent that situation. I've had a completely
unresponsive cluster on two occasions. With the current settings, it has
been running stably for several weeks now.
I've learned a few of these things the hard way. I think an ES tuning guide
for non-experts is desperately needed. The out-of-the-box experience is not
really appropriate for any serious production environment. But then, you
wouldn't run MySQL with default settings in production either. In my
experience, you currently need to piece together bits of good advice spread
across the documentation and various forum posts. If you have an untuned
Elasticsearch in production, there are several failure scenarios that are
likely to result in unavailability and data loss. Especially if you are
using ELK with lots of log data, you need to tune, or you will basically
have a dead cluster in no time due to OOMs.
Jilles
I have just decreased the JVM heap to 50% (12GB); I will see if that
helps.
@Jilles:
I am using the default Logstash template and I thought that the _all field
was disabled by default... Ah no, that is not the case. I will correct this
setting, but why is this field enabled by default?
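My plan is to layer a second template over the default Logstash one, something
like the sketch below (the template name is just what I would call it, and
only indices created after this will pick it up):

  curl -XPUT 'localhost:9200/_template/logstash_disable_all' -d '
  {
    "template": "logstash-*",
    "order": 1,
    "mappings": {
      "_default_": {
        "_all": { "enabled": false }
      }
    }
  }'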
I will set a limit for the field data cache. But regarding the circuit
breakers, what are they?
I totally agree that it would be good to have a deep tuning guide on ES.
Setting the JVM heap to 50% (12GB) did not ease the problem, as I
noticed GC pauses of up to 3 minutes.
I will really need to add a bunch of RAM to my machine...
With ES on a single machine, "tuning" does not cure the symptoms in the
long run. ES was designed to scale out on many nodes, so the simplest path
is to add nodes.
In a restricted environment, you could try to disable features that consume
a fair amount of resources: disable the _source and _all fields and the
bloom filter codec, simplify complex analyzers in the mapping, reduce the
number of shards, shrink the filter caches, and so on.
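For example, a per-index sketch along those lines for a Logstash-style setup
(names and values are illustrative; note that disabling _source means the
original documents can no longer be retrieved or reindexed out of ES):

  curl -XPUT 'localhost:9200/_template/lean_logs' -d '
  {
    "template": "logstash-*",
    "order": 1,
    "settings": {
      "index.number_of_shards": 1,
      "index.number_of_replicas": 0
    },
    "mappings": {
      "_default_": {
        "_source": { "enabled": false },
        "_all":    { "enabled": false }
      }
    }
  }'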
The idea behind the single-machine restriction was, for instance, to
install ELK on one machine and perform fast indexing and review of a set of
logs. What I got wrong is that the log volume can be large (hundreds of
GB), so this architecture will not work, according to the answers
above... (the goal was to replace Splunk in a similar setup)