Dear elasticsearch users,

I am running a PHP web application whose data layer is backed by 3 elasticsearch nodes. Once in a while an individual node fails (e.g. recently one ran into a JVM heap OOM), but the cluster still goes green again (2 nodes are enough), so I would like to make the application fault tolerant.

What is the best practice to avoid sending requests to the instance that fails? Would you implement health checks at the application layer?

Any examples or advice would be much appreciated.
What exactly are you doing? Are you indexing documents? Are you searching for documents? Do you keep your queries in a logfile so you can trace what is going on? Did you enable GC-level logging in Elasticsearch? Do you have a sizing strategy for your application, that is, have you calculated in advance how many resources you will need?
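On the GC question, a minimal sketch of one way to turn on GC logging with standard HotSpot flags, assuming the startup script of that era passes ES_JAVA_OPTS through to the JVM (the log path is a placeholder):

    # assumes bin/elasticsearch appends ES_JAVA_OPTS to the JVM arguments;
    # the log path is a placeholder
    export ES_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
      -Xloggc:/var/log/elasticsearch/gc.log"
    bin/elasticsearch

Elasticsearch itself also warns about long GC pauses via its monitor.jvm logger, so checking the node logs is worthwhile either way.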
Jörg
We've considered using haproxy to load-balance (round-robin) the REST calls to the different nodes. haproxy can easily do a health check by pinging the machine or by sending HTTP requests and checking the response.
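For illustration, a minimal haproxy sketch along those lines; the node addresses are placeholders, and GET / is used as the check because any live elasticsearch node answers it with HTTP 200:

    listen elasticsearch
        bind *:9200
        balance roundrobin
        # take a node out of rotation when its HTTP checks start failing
        option httpchk GET /
        server es1 10.0.0.1:9200 check
        server es2 10.0.0.2:9200 check
        server es3 10.0.0.3:9200 check

The application then talks only to the haproxy address and never needs to know which node served the request.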
In the end we went with an off-the-shelf load balancer from our hosting company. This seems to work just fine, but we haven't tested it ourselves.
Jaap
another solution might be to run a client node on your web application server. This is an elasticsearch node which does not hold any data and is not allowed to become master, but which still knows the cluster's internal structure and which nodes can be queried (and a little bit more). There is a comment about that configuration in the default elasticsearch.yml as well (which is obviously not the best place for it).
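A minimal sketch of the two settings that make such a client-only node (elasticsearch.yml on the web application server, 0.90-era syntax; everything else stays at the defaults):

    # joins the cluster and routes requests, but holds no data
    node.data: false
    # and is never eligible to be elected master
    node.master: false

The PHP application can then send all its REST calls to localhost:9200 and let the client node fan them out to whichever data nodes are actually up.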
@jorge What are the most common causes for OOM in ES?

@foufos I know that if you use the Java client and give all node addresses to the TransportClient, it will manage this for you. Otherwise you can just keep a list of servers to try your request against if you don't want a load balancer (I am assuming those are the only two ways if you opt for the REST API via HTTP).
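A minimal PHP sketch of that list-of-servers approach (the function name and hosts are made up for illustration):

    <?php
    // Try each node in turn; the first one that answers wins.
    function esRequest(array $hosts, $path) {
        foreach ($hosts as $host) {
            $ch = curl_init("http://$host$path");
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2); // fail over quickly
            $body = curl_exec($ch);
            $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
            curl_close($ch);
            if ($body !== false && $status < 500) {
                return $body;
            }
        }
        throw new RuntimeException('all elasticsearch nodes failed');
    }

    echo esRequest(array('10.0.0.1:9200', '10.0.0.2:9200'), '/_cluster/health');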
OOM happens when the heap size is not sufficient.

In ES you have to consider which workloads need heap space:

- large segment merges: the bigger the index grows, the more heap is required for merging
- large documents and large bulk requests while indexing
- large result sets
- the field cache used for filtering and faceting

Finding a reasonable heap size requires some testing under different workloads. There is no general rule for a "correct" heap size.

You can tackle OOM by scaling out (adding more nodes), by scaling up (adding more RAM per node), or by streamlining resource consumption over the lifecycle of the ES process (smaller segment merges, smaller bulk requests, smaller query results, avoiding "bad" queries with too heavy resource consumption).
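For example, the stock startup scripts of that era read the ES_HEAP_SIZE environment variable and use it for both -Xms and -Xmx; the value below is purely illustrative, not a recommendation:

    # pick a size by testing under your own workload, not by a general rule
    export ES_HEAP_SIZE=4g
    bin/elasticsearch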
Jörg
After investigation it turns out we had a lot of exceptions due to wrong mapping attributes. After fixing this, we haven't experienced another similar issue. Can this cause OOM?

@Jorg We have about 850K documents, fairly small in size, and we also have routing set up so there is less overhead.

So for the moment the problem is not triggered, but we have to create a fallback in case another server becomes unresponsive and we get another split-brain scenario. So now we are considering implementing a solution along the lines @alex suggested.

Do you think that by doing something like that we can avoid a split-brain scenario?
thank you
foufos
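For reference, the usual guard against split brain in zen discovery of that era is to require a quorum of master-eligible nodes before any master can be elected; a minimal elasticsearch.yml sketch for a three-node cluster:

    # quorum = (3 master-eligible nodes / 2) + 1 = 2; a lone partitioned
    # node can then no longer elect itself master
    discovery.zen.minimum_master_nodes: 2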