Frustrated with REST API

crimondi · August 2, 2014, 2:46pm

This is going to sound like a bit of a gripe session so I apologize in
advance. There seems to be a lot of instability and ineffectiveness around
using the REST API to make configuration changes. I realize there have been
some issues related to the NPE returns on certain calls. In addition to
those problems (which I believe have been addressed in ES 1.2.x and 1.3.x)
I have found that if the cluster is in anything but a pristine state the
calls simply do not return or error out with a 503 response.

Activities such as changing the number of replicas on certain indices or
modifying throttle settings almost always return a 503 on a cluster that is
yellow. It is when the cluster is in a degraded state that we need these
calls the most! Also, simple information calls such as /_cat/nodes often
fail to return when the cluster is yellow. Sometimes an API call appears to
hang, only for us to find out later that the setting really did take.

We maintain multiple ES clusters internally and all the tooling we have
built around supporting them simply assumes the acknowledgement returned
from the API calls is unreliable. Can we expect better reliability with the
Java APIs? Are there plans to make the RESTful calls more robust?
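
To make the "assume the acknowledgement is unreliable" approach concrete, here is a minimal sketch of how such tooling might apply a replica change and then read the setting back, whatever the PUT returned. The helper names (`http_json`, `set_replicas_verified`) are hypothetical, and the response shape it parses (`{index: {"settings": {"index": {...}}}}`) is an assumption about the `_settings` output on ES 1.x, so adjust it to what your version returns.

```python
import json
import urllib.error
import urllib.request


def http_json(method, url, body=None, timeout=10.0):
    """Send one JSON request.

    Returns (status, parsed_body). Returns (None, None) when no answer
    arrived at all (timeout, refused connection, DNS failure).
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # A non-2xx answer such as 503: the status is known, the outcome is not.
        return err.code, None
    except OSError:
        # Covers URLError, socket timeouts and refused connections.
        return None, None


def set_replicas_verified(base_url, index, replicas, request=http_json):
    """Apply number_of_replicas, then read it back instead of trusting the ack."""
    settings_url = "%s/%s/_settings" % (base_url.rstrip("/"), index)
    put_status, _ = request(
        "PUT", settings_url, {"index": {"number_of_replicas": replicas}}
    )
    # Whatever the PUT returned (200, 503 or nothing), check what actually stuck.
    get_status, current = request("GET", settings_url)
    if get_status != 200 or not current:
        return {"put_status": put_status, "applied": None}  # could not verify
    value = (
        current.get(index, {})
        .get("settings", {})
        .get("index", {})
        .get("number_of_replicas")
    )
    # ES reports setting values as strings, so compare as strings.
    return {"put_status": put_status, "applied": str(value) == str(replicas)}
```

A call such as `set_replicas_verified("http://localhost:9200", "logs", 2)` (host and index name are placeholders) reports both the PUT status and whether the value really changed. The `request` parameter exists only so the HTTP layer can be swapped out in tests.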

Thanks,

Chris

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6c44ad78-d699-4ec0-937c-15322914f924%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · August 2, 2014, 2:54pm

This looks like a network configuration challenge. You should review your
whole network setup. Look at network device config, host names, gateway
setup, DNS name resolution, and routers/switches (if your ES clusters span
subnetworks).

If your network setup is not 100% solid, requests over the wire will
time out, fail, hang, etc.

ES (and not only ES) cannot remedy such situations from inside the
application. You should not blame the REST API; with the Java API, the
situation would be the same.

Jörg

crimondi · August 2, 2014, 3:10pm

I have considered the network before, but I don't understand a few things:

Why would these API calls work consistently when the cluster is green but
not when it is yellow or red?

Before I make these calls I can check the health of the cluster, and 100%
of the nodes are "checked in". So it is not as if network connectivity is
preventing my data nodes from communicating with the masters.

We run multiple clusters. Some are split across subnets and some are
not. However, this API badness takes place in both scenarios.

In short, I have seen zero evidence that network latency or flakiness has
caused any cluster issues. I am open to any type of testing that might
disprove that, but I really don't think this is a network connectivity
problem.

jprante · August 2, 2014, 6:28pm

Because I don't know your system setup, I can only make wild guesses.

If it is not the network, can you observe frequent index recovery or shard
reallocations? 503 (service unavailable) is a code that is sent when shards
are not available.
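
One rough way to look for that kind of shard movement is to poll the cluster health endpoint and watch the shard counters. The sketch below only summarises an already parsed `_cluster/health` response; the function name `shard_activity` is hypothetical and the sample document is made up for illustration, not taken from a real cluster.

```python
import json


def shard_activity(health):
    """Summarise shard movement from a parsed _cluster/health response."""
    moving = {
        "relocating": health.get("relocating_shards", 0),
        "initializing": health.get("initializing_shards", 0),
        "unassigned": health.get("unassigned_shards", 0),
    }
    return {
        "status": health.get("status"),
        "nodes": health.get("number_of_nodes"),
        # A non-zero total means shards are recovering, moving or missing.
        "shards_in_flux": sum(moving.values()),
        "detail": moving,
    }


# Made-up sample of what GET /_cluster/health might return on a yellow cluster.
sample = json.loads("""
{
  "cluster_name": "example",
  "status": "yellow",
  "number_of_nodes": 5,
  "active_shards": 40,
  "relocating_shards": 0,
  "initializing_shards": 2,
  "unassigned_shards": 5
}
""")

print(shard_activity(sample))
```

If `shards_in_flux` stays above zero, or keeps bouncing between polls, that would point to ongoing recovery or reallocation rather than to the REST layer itself.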

Have you set the replica level by any chance, or are index replicas disabled?

Jörg
