I've been using the elasticsearch rpms (1.1.1) on our centos 6.5 setup and
I've been wondering about the recommended way to configure it given that it
deploys an init.d script with defaults.
I figured out that I can use /etc/sysconfig/elasticsearch for things like
heap size. However, /usr/share/elasticsearch/bin/elasticsearch.in.sh
configures some defaults for garbage collection:
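Paraphrasing from memory, the relevant part of that script in 1.x sets the usual CMS flags, roughly along these lines (the exact lines in 1.1.1 may differ slightly):

    # defaults appended to JAVA_OPTS by elasticsearch.in.sh (approximate)
    JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
    JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
    JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"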
So, I'm getting some default configuration for garbage collection that I
probably should be tuning, especially given that it is running out of
memory after a few weeks on our setup with kibana and a rather large number
of logstash indices (over 200GB).
Is it possible to have a custom garbage collection strategy without
modifying files deployed and overwritten by the rpm? elasticsearch.in.sh
seems specific to the 1.1.1 version given that it also includes the
classpath definition.
In any case, it might be handy to clarify the recommended way to configure
elasticsearch when deployed using the rpm as opposed to a developer machine
with a tarball. Most documentation I'm finding seems to assume the latter.
Any idea on why you're running out of memory? You can monitor things like
the field data and filter caches and the JVM memory pools to get some clues.
I would assume your problem is that field data is accumulating, not the GC
settings themselves. Depending on how much heap you have, how many nodes,
and how much heap is used for other things, I'd limit field data to a slice
of the total memory (for example, 30%).
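For example, if I remember the 1.x endpoints right, something like this gives a quick picture of where the heap is going:

    # per-node field data usage (the _cat API is available since 1.0)
    curl 'localhost:9200/_cat/fielddata?v'
    # cache sizes and JVM memory pools
    curl 'localhost:9200/_nodes/stats/indices,jvm?pretty'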
Sounds like that could be the cause. What setting would I need to configure
for this? Regardless, I'd like to know where to start with configuring
garbage collection for ES.
Jilles
The setting is indices.fielddata.cache.size. You can check out the fielddata
docs for more options, like adjusting the circuit breaker.
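For example, in elasticsearch.yml (setting names as I remember them from the 1.x docs, so double-check them for your version):

    # cap the field data cache at a share of the heap
    indices.fielddata.cache.size: 30%
    # circuit breaker that rejects requests which would load more field data than this
    indices.fielddata.breaker.limit: 60%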
To change the GC settings, I usually edit the in.sh script. You have an
interesting point that it might be overwritten by an RPM upgrade. I'm not
aware of a way to override them, maybe somebody else is.
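One thing that might work, though I haven't verified it against the 1.1.1 packaging: the init script sources /etc/sysconfig/elasticsearch, and if your copy exposes ES_JAVA_OPTS, anything you put there should end up on the JVM command line after the defaults from in.sh, without touching RPM-owned files. A sketch, with made-up values:

    # /etc/sysconfig/elasticsearch (assuming the init script passes ES_JAVA_OPTS through)
    ES_HEAP_SIZE=8g                                        # example value only
    ES_JAVA_OPTS="-verbose:gc -XX:CMSInitiatingOccupancyFraction=65"

Flags appended this way can tweak the existing CMS setup; swapping the collector outright would likely clash with the CMS flags already set in in.sh.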
The way I handle configuration with rpms is to create a separate directory
with the config files for each cluster and then start the nodes pointing at
the config file to use with -Des.config. In that config file I set path.conf
to the same directory so that logging.yml can be found. I also use a custom
start script so that I can run ulimit and set a few parameters before
starting es.
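Something like this, with all paths and values as placeholders:

    #!/bin/sh
    # wrapper start script for one cluster; config lives outside the RPM-owned paths
    CONF_DIR=/etc/elasticsearch/cluster1      # hypothetical per-cluster config directory
    export ES_HEAP_SIZE=8g                    # example value only
    ulimit -n 65535                           # raise the open-files limit before starting es
    # elasticsearch.yml in $CONF_DIR also sets path.conf to that directory so logging.yml is found
    exec /usr/share/elasticsearch/bin/elasticsearch \
        -Des.config=$CONF_DIR/elasticsearch.yml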