Search in plugin hangs

Mauricio_Scheffer · February 10, 2015, 12:27pm

Hi, I'm writing a plugin that implements a ScoreFunction that needs to look
up some data from a separate index. It does that by having a Client
instance injected. This works perfectly on my box, but when I deploy it to
an EC2 cluster, one of the nodes simply hangs when calling the Client.
The output for /_cat/thread_pool is:

elasticsearch-cluster3.localdomain 127.0.1.1 0 0 0 0 0 0 3 18 0
elasticsearch-cluster2.localdomain 127.0.1.1 0 0 0 0 0 0 0 0 0
elasticsearch-cluster1.localdomain 127.0.1.1 0 0 0 0 0 0 0 0 0

Those 3 active requests never finish and, even worse, they block the node
entirely: it stops responding to all other search requests (which get
queued up until the queue fills and the node starts rejecting requests).
There is no CPU usage on the hanging node.
Obviously all the nodes are configured identically (deployed through
OpsWorks).
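
For anyone comparing this with their own cluster: each row above has eleven whitespace-separated columns. Assuming they follow the default /_cat/thread_pool layout of Elasticsearch 1.x (host, ip, then active/queue/rejected for the bulk, index and search pools; the post does not state this), the last three numbers are the search pool counters. A minimal Java sketch that reads a row under that assumption follows; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CatThreadPool {

    /** Search pool counters of one row. The column positions are an assumption. */
    public static final class Row {
        public final String host;
        public final int searchActive;
        public final int searchQueue;
        public final int searchRejected;

        Row(String host, int searchActive, int searchQueue, int searchRejected) {
            this.host = host;
            this.searchActive = searchActive;
            this.searchQueue = searchQueue;
            this.searchRejected = searchRejected;
        }
    }

    /** Parses one line: host, ip, 3 bulk columns, 3 index columns, 3 search columns. */
    public static Row parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 11) {
            throw new IllegalArgumentException("expected 11 columns, got " + f.length);
        }
        return new Row(f[0],
                Integer.parseInt(f[8]),
                Integer.parseInt(f[9]),
                Integer.parseInt(f[10]));
    }

    /** Returns the hosts whose search pool has queued requests. */
    public static List<String> hostsWithQueuedSearches(List<String> lines) {
        List<String> hosts = new ArrayList<>();
        for (String line : lines) {
            Row row = parse(line);
            if (row.searchQueue > 0) {
                hosts.add(row.host);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "elasticsearch-cluster3.localdomain 127.0.1.1 0 0 0 0 0 0 3 18 0",
                "elasticsearch-cluster2.localdomain 127.0.1.1 0 0 0 0 0 0 0 0 0",
                "elasticsearch-cluster1.localdomain 127.0.1.1 0 0 0 0 0 0 0 0 0");
        System.out.println(hostsWithQueuedSearches(lines));
    }
}
```

Read this way, only cluster3 has search work pending (3 active, 18 queued, 0 rejected so far).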

Any ideas? I guess injecting Client is not the way to go here? Any
alternatives worth trying?

Thanks,
Mauricio

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0350f575-7c4c-4d88-a471-2ff4d8eeb764%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mauricio_Scheffer · February 12, 2015, 11:58am

I got this working by creating my own TransportClient instance instead of
using the injected Client.
Still, it would be nice to understand what's going on here. Also, locking up
the node like this seems like a pretty serious bug.
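
One pattern that produces exactly these symptoms (a few active requests that never finish, a growing queue, no CPU usage) is thread pool starvation: every thread of a fixed pool blocks waiting for a task that was submitted to the same pool. Whether that is what happens inside the plugin is not established in this thread; the sketch below only illustrates the pattern with plain java.util.concurrent. It does not use Elasticsearch at all, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolStarvation {

    /**
     * Runs outerTasks tasks on a fixed pool of poolSize threads. Each task
     * submits an inner task to the SAME pool and blocks on its result.
     * Returns true if all outer tasks finish within the timeout.
     */
    public static boolean completes(int poolSize, int outerTasks, long timeoutMillis) {
        final ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Wait until as many outer tasks as can run concurrently have started,
        // so the outcome does not depend on scheduling luck.
        final CountDownLatch started = new CountDownLatch(Math.min(poolSize, outerTasks));
        final Callable<Integer> inner = () -> 42;
        final Callable<Integer> outer = () -> {
            started.countDown();
            started.await();
            // Blocking here keeps this pool thread occupied while the inner
            // task waits in the queue of the same pool.
            return pool.submit(inner).get();
        };

        List<Future<Integer>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < outerTasks; i++) {
                futures.add(pool.submit(outer));
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            for (Future<Integer> f : futures) {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                f.get(remaining, TimeUnit.NANOSECONDS);
            }
            return true;
        } catch (TimeoutException e) {
            return false; // every pool thread is stuck waiting on queued work
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupts the blocked threads so the JVM can exit
        }
    }

    public static void main(String[] args) {
        System.out.println("3 threads, 3 nested tasks: " + completes(3, 3, 500));
        System.out.println("4 threads, 3 nested tasks: " + completes(4, 3, 2000));
    }
}
```

With three threads and three nested tasks the call times out, because all three threads are parked and nothing is left to run the inner tasks. With one spare thread it completes. If something like this is at play, it would also fit the observation that the problem shows up on some machines and not others, since pool sizes typically depend on the number of cores.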

jprante · February 12, 2015, 4:45pm

Can you publish your code so it can be reproduced? Then you might get
feedback.

Jörg

Mauricio_Scheffer · February 13, 2015, 5:40pm

I can't publish this code, but I'll see if I can find some time to create a
repro.
Basically the case is as simple as I described before: a ScoreFunction that
uses an injected Client instance to query a separate index.

Cheers

--
Mauricio
