NoNodeAvailableException when indexing from Hadoop


(phobos182) #1

I'm wondering what the best way is to configure the TransportClient when indexing en masse from Hadoop. Right now I have set the client to sniff the cluster and given it just one hostname of a server in the ElasticSearch cluster. Is there a better way to load balance discovery of the cluster?

I'm getting a lot of these errors when trying to index.

Caused by: org.elasticsearch.client.transport.NoNodeAvailableException: No node available
    at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:149)
    at org.elasticsearch.client.transport.support.InternalTransportClient.bulk(InternalTransportClient.java:170)
    at org.elasticsearch.client.transport.TransportClient.bulk(TransportClient.java:266)
    at org.elasticsearch.client.action.bulk.BulkRequestBuilder.doExecute(BulkRequestBuilder.java:122)
    at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:56)
    at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:51)
    ... 15 more

Thanks,


(Shay Banon) #2

That should be fine. When do you get this exception? Can you gist it?
On Wednesday, May 4, 2011 at 7:26 PM, phobos182 wrote:

I'm wondering what is the best way to set the TransportClient when indexing
en masse from Hadoop. Right now I have set the client to sniff the cluster,
and set just one hostname of a server in the ElasticSearch cluster. Is there
a better way to load balance discovery of the cluster?

Thanks,

--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/NoNodeAvailableException-when-indexing-from-Hadoop-tp2899518p2899518.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.


(phobos182) #3

Sorry for the late reply.

Here is the exception. We are attempting to use a Hive UDF to index to ElasticSearch. It fails with the error below.

Building the TransportClient is very simple, so I do not know why the list of nodes it discovers is empty.


(lukeforehand-2) #4

To expand on the explanation of this issue: when the UDF is
initialized, a TransportClient is created with sniffing on. An
additional check in our init code verifies that all nodes have been
discovered. While the UDF is running during the Hive query, the
NoNodeAvailableException is thrown. Looking at the ES source code,
this exception can happen if the discovered node list is empty (this
doesn't seem to be the issue) or if all discovered nodes fail to perform
the request (ConnectTransportException is caught and ignored).
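
To make the two failure paths described above concrete, here is a minimal sketch of that kind of failover loop. It is not the Elasticsearch source: `NodeFailover`, `ConnectFailure` and `NoNodeAvailable` are hypothetical stand-ins, and nodes are plain strings. It only illustrates why the caller sees a generic "no node available" error even when the real cause is a connection failure on every node.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a per-node connection error; not an Elasticsearch class. */
class ConnectFailure extends RuntimeException {
    ConnectFailure(String message) { super(message); }
}

/** Hypothetical stand-in for the "no node available" error; not an Elasticsearch class. */
class NoNodeAvailable extends RuntimeException {
    NoNodeAvailable() { super("No node available"); }
}

class NodeFailover {
    /**
     * Tries the request against each node in turn. A connection failure moves
     * on to the next node. If the list is empty, or every node fails to
     * connect, the caller only ever sees NoNodeAvailable.
     */
    static <R> R execute(List<String> nodes, Function<String, R> request) {
        for (String node : nodes) {
            try {
                return request.apply(node);
            } catch (ConnectFailure ignored) {
                // Swallowed on purpose: this is why the underlying cause
                // never reaches the caller's stack trace.
            }
        }
        throw new NoNodeAvailable();
    }
}
```

With a loop shaped like this, checking the node list once at init time does not rule out the second path: the list can be complete and every connection can still fail later.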

Since we check the discovered node list before processing, I'm
assuming all discovered nodes are failing with
ConnectTransportException. Each request is a bulk request of
1000 docs. This is a preliminary test with 90 Hadoop mapper
threads communicating with a 4-server ElasticSearch cluster. Can you
provide any insight on how we could continue to debug this issue?

Thanks!
-Luke Forehand

On May 7, 2:14 am, Shay Banon shay.ba...@elasticsearch.com wrote:

That should be fine. When do you get this exception? Can you gist it?



(Rich Kroll) #5

This could also be thrown if the nodes are expecting a different cluster
name. Try turning up the debug level on the client; it will likely tell you
the cause.
On May 9, 2011 3:26 PM, "lukeforehand" lukeforehand@gmail.com wrote:

To further expand on the explanation of this issue, when the UDF is
initialized, a TransportClient is initialized with sniffing on. An
additional check in our init code verifies that all nodes have been
discovered. During the hive query as the UDF is progressing, the
NoNodeAvailableException is thrown. Looking at the ES source code,
this exception can happen if the discovered node list is empty (this
doesn't seem to be the issue) or all discovered nodes fail to perform
the request (ConnectTransportException is caught and ignored).

Since we check the discovered node list before processing, I'm
assuming all discovered nodes are failing with
ConnectTransportException. Each request is a bulk request with bulk
size of 1000 docs. This is a preliminary test with 90 Hadoop mapper
threads communicating with a 4 server ElasticSearch cluster. Can you
provide any insight on how we could continue to debug this issue?

Thanks!
-Luke Forehand



(lukeforehand-2) #6

The solution was to turn off JVM re-use in the map-reduce job. We are
looking into the exact cause now but this clue might lead someone else
to the root cause as I look at the source code.

-Luke

On May 9, 5:40 pm, Rich Kroll kroll.r...@gmail.com wrote:

This could also be thrown if the nodes are expecting a different cluster
name. Try turning up the debug level on the client, it will likely tell you
the cause.


(Pat Christopher) #7

What does turning off JVM re-use do to memory consumption for the UDF?

Pat

On May 12, 8:33 am, lukeforehand lukeforeh...@gmail.com wrote:

The solution was to turn off JVM re-use in the map-reduce job. We are
looking into the exact cause now but this clue might lead someone else
to the root cause as I look at the source code.

-Luke



(Shay Banon) #8

I don't know what this feature does. I assume it reuses the same JVM for the same map-reduce jobs? I don't see why this would affect creating client nodes or leave them disconnected. Are you closing the nodes when the job is done?
On Thursday, May 12, 2011 at 6:33 PM, lukeforehand wrote:

The solution was to turn off JVM re-use in the map-reduce job. We are
looking into the exact cause now but this clue might lead someone else
to the root cause as I look at the source code.

-Luke



(phobos182) #9

The property in question for Hadoop is mapred.tasks.jvm.reuse

It's a performance optimization for short-lived tasks (<30 seconds) that decreases the overhead of spawning additional child JVMs. It takes about 1 second to spawn a JVM, so if you have a lot of tasks that each take, say, 3 seconds, then setting JVM reuse to greater than 1 can be a big performance factor, since much of the time is spent creating the JVM for tasks. In this particular job we did not have short-lived tasks, but we had an incorrect default to reuse the JVM 4 times.
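
As a rough worked example of the arithmetic above, the sketch below computes the share of wall-clock time spent starting JVMs. The 1-second spawn time and 3-second task time come from the post; the 300-second "long task" is a made-up figure for contrast, and the `JvmReuseOverhead` class is purely illustrative. For reference, the key documented in Hadoop 0.20/1.x for this setting is `mapred.job.reuse.jvm.num.tasks`.

```java
class JvmReuseOverhead {
    /**
     * Fraction of wall-clock time spent starting a JVM when each JVM runs
     * tasksPerJvm tasks back to back before exiting.
     */
    static double startupFraction(double spawnSeconds, double taskSeconds, int tasksPerJvm) {
        double work = taskSeconds * tasksPerJvm;
        return spawnSeconds / (spawnSeconds + work);
    }

    public static void main(String[] args) {
        // Figures from the post: about 1 s to spawn a JVM, about 3 s per short task.
        System.out.printf("short task, reuse=1: %.1f%%%n", 100 * startupFraction(1, 3, 1));
        System.out.printf("short task, reuse=4: %.1f%%%n", 100 * startupFraction(1, 3, 4));
        // Hypothetical long task (300 s): startup cost is already negligible.
        System.out.printf("long task,  reuse=1: %.1f%%%n", 100 * startupFraction(1, 300, 1));
    }
}
```

For 3-second tasks, startup drops from a quarter of the time to under a tenth when the JVM is reused 4 times. For long tasks the saving is well under one percent, which is why turning reuse off cost this job essentially nothing.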

I'm not sure at this time whether anything was being left open; not from what I can tell initially. I'm thinking something in the UDF was not being closed correctly.
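
The root cause is never confirmed in this thread, so the following is only one plausible mechanism, sketched with hypothetical classes (`FakeClient`, `LeakyUdf`, `SafeUdf`) rather than real Hive or Elasticsearch types. It assumes the UDF caches its client in a static field. With JVM reuse, static state outlives a task, so a client closed at the end of one task can be picked up, already closed, by the next task in the same JVM.

```java
/** Hypothetical stand-in for a network client; not an Elasticsearch class. */
class FakeClient {
    private boolean closed;

    void close() { closed = true; }

    String send(String doc) {
        if (closed) {
            throw new IllegalStateException("no node available (client closed)");
        }
        return "indexed " + doc;
    }
}

/** Mimics a UDF that caches its client in a static field and never clears it. */
class LeakyUdf {
    static FakeClient client;   // survives across tasks when the JVM is reused

    void init() {
        if (client == null) {
            client = new FakeClient();   // skipped by the second task in the same JVM
        }
    }

    String evaluate(String doc) { return client.send(doc); }

    void close() { client.close(); }     // closes the client but keeps the stale reference
}

/** Same UDF, but the cached reference is cleared when the task ends. */
class SafeUdf {
    static FakeClient client;

    void init() {
        if (client == null) {
            client = new FakeClient();
        }
    }

    String evaluate(String doc) { return client.send(doc); }

    void close() {
        client.close();
        client = null;                   // the next task in this JVM builds a fresh client
    }
}
```

If something like this were the cause, disabling JVM reuse would hide the problem because every task would start with empty static state. Clearing the reference on close would address it directly.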

There was no impact on memory utilization as the number of documents per map task did not change. We were just holding 1,000 documents in memory for the bulk request then getting rid of them. So the job runs just as fast as ever, but with no stability / memory issues. Now we are on to performance tuning our schema and fields.
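
The "hold 1,000 documents, send, then drop them" pattern described above can be sketched as follows. This is an illustration under assumptions, not the UDF from this thread: `BulkBuffer` is a hypothetical class, documents are plain strings, and the `sender` callback stands in for whatever actually issues the bulk request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BulkBuffer {
    private final int bulkSize;
    private final Consumer<List<String>> sender;   // hypothetical hook that sends one bulk request
    private List<String> pending = new ArrayList<>();

    BulkBuffer(int bulkSize, Consumer<List<String>> sender) {
        this.bulkSize = bulkSize;
        this.sender = sender;
    }

    /** Buffers one document and sends a batch once bulkSize documents are held. */
    void add(String doc) {
        pending.add(doc);
        if (pending.size() >= bulkSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; call once more at the end of the task. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> batch = pending;
        pending = new ArrayList<>();   // drop our reference so the sent batch can be collected
        sender.accept(batch);
    }

    int pendingCount() { return pending.size(); }
}
```

Because at most `bulkSize` documents are held at any time, memory per map task depends on the batch size and not on how many tasks share a JVM, which matches the observation that turning off JVM reuse did not change memory use.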

