I'm wondering what the best way is to set up the TransportClient when indexing en masse from Hadoop. Right now I have the client set to sniff the cluster, and have given it just one hostname of a server in the Elasticsearch cluster. Is there a better way to load balance discovery of the cluster?
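For reference, the client is built roughly like this (a minimal sketch; the cluster name, host, and port below are placeholders, not our real settings):

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class ClientFactory {

    public static TransportClient build() {
        // Sniffing lets the client discover the rest of the cluster from the
        // seed node; "mycluster" and "es-node-1" are placeholder values.
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", "mycluster")
                .put("client.transport.sniff", true)
                .build();
        TransportClient client = new TransportClient(settings);
        // Only one seed host is listed right now; listing several seeds would
        // at least remove the single point of failure for initial discovery.
        client.addTransportAddress(new InetSocketTransportAddress("es-node-1", 9300));
        return client;
    }
}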
I'm getting a lot of these errors when trying to index.
Caused by: org.elasticsearch.client.transport.NoNodeAvailableException: No node available
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:149)
at org.elasticsearch.client.transport.support.InternalTransportClient.bulk(InternalTransportClient.java:170)
at org.elasticsearch.client.transport.TransportClient.bulk(TransportClient.java:266)
at org.elasticsearch.client.action.bulk.BulkRequestBuilder.doExecute(BulkRequestBuilder.java:122)
at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:56)
at org.elasticsearch.client.action.support.BaseRequestBuilder.execute(BaseRequestBuilder.java:51)
... 15 more
Thanks,
That should be fine. When do you get this exception? Can you gist it?
To further expand on the explanation of this issue: when the UDF is initialized, a TransportClient is created with sniffing on. An additional check in our init code verifies that all nodes have been discovered. During the Hive query, as the UDF progresses, the NoNodeAvailableException is thrown. Looking at the ES source code, this exception can happen if the discovered node list is empty (which doesn't seem to be the case here) or if all discovered nodes fail to perform the request (ConnectTransportException is caught and ignored). Since we check the discovered node list before processing, I'm assuming all discovered nodes are failing with ConnectTransportException. Each request is a bulk request with a bulk size of 1,000 docs. This is a preliminary test with 90 Hadoop mapper threads communicating with a 4-server Elasticsearch cluster. Can you provide any insight into how we could continue to debug this issue?
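For context, the indexing path in the UDF looks roughly like the sketch below (simplified; the class and method names here are illustrative, not our actual code):

import org.elasticsearch.client.action.bulk.BulkRequestBuilder;
import org.elasticsearch.client.transport.TransportClient;

// Simplified sketch of the UDF's indexing path; names are illustrative.
public class IndexingSketch {

    private static final int BULK_SIZE = 1000;

    private final TransportClient client;
    private BulkRequestBuilder bulk;

    public IndexingSketch(TransportClient client) {
        this.client = client;
        this.bulk = client.prepareBulk();
        // Init-time sanity check: after sniffing, the client should have
        // connected to the discovered nodes.
        if (client.connectedNodes().isEmpty()) {
            throw new IllegalStateException("no Elasticsearch nodes discovered");
        }
    }

    public void add(String index, String type, String jsonDoc) {
        bulk.add(client.prepareIndex(index, type).setSource(jsonDoc));
        if (bulk.numberOfActions() >= BULK_SIZE) {
            flush();
        }
    }

    public void flush() {
        if (bulk.numberOfActions() == 0) {
            return;
        }
        // NoNodeAvailableException is thrown from here when every discovered
        // node fails the request.
        if (bulk.execute().actionGet().hasFailures()) {
            throw new IllegalStateException("bulk request had failures");
        }
        bulk = client.prepareBulk();
    }
}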
This could also be thrown if the nodes are expecting a different cluster name. Try turning up the debug level on the client; it will likely tell you the cause.
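Something along these lines (just a sketch; the cluster name and host are placeholders) -- the cluster.name on the client has to match the one the nodes were started with, and a DEBUG-level client log will show why each node gets dropped:

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class ClusterNameCheck {

    public static void main(String[] args) {
        // A cluster.name mismatch gets the node silently dropped by the
        // client, and the only symptom is NoNodeAvailableException.
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", "mycluster")   // placeholder; must match the server nodes
                .put("client.transport.sniff", true)
                .build();
        TransportClient client = new TransportClient(settings);
        client.addTransportAddress(new InetSocketTransportAddress("es-node-1", 9300)); // placeholder host

        // With the client's logging turned up (for example
        // log4j.logger.org.elasticsearch=DEBUG in log4j.properties),
        // the logs explain why each node fails to connect.
        System.out.println("connected nodes: " + client.connectedNodes());
        client.close();
    }
}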
The solution was to turn off JVM re-use in the map-reduce job. We are looking into the exact cause now, but this clue might lead someone else to the root cause while I look through the source code.
I don't know what this feature does; I assume it means reusing the same JVM across map-reduce tasks of the same job? I don't see why this would affect creating client nodes and leaving them unable to connect. Are you closing the clients when the job is done?
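In other words, something like the pattern below is what I would expect around the client's lifecycle (just a sketch; the host is a placeholder):

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class TaskLifecycleSketch {

    public static void runTask() {
        // The client created at task start should be closed when the task
        // ends, even on failure; in a reused JVM, clients that are never
        // closed pile up across tasks along with their transport threads.
        TransportClient client = new TransportClient(
                ImmutableSettings.settingsBuilder()
                        .put("client.transport.sniff", true)
                        .build());
        client.addTransportAddress(new InetSocketTransportAddress("es-node-1", 9300)); // placeholder
        try {
            // ... build and execute bulk requests ...
        } finally {
            client.close(); // releases transport connections and thread pools
        }
    }
}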
The property in question for Hadoop is mapred.job.reuse.jvm.num.tasks.
It's a performance optimization for short-lived tasks (under about 30 seconds) that reduces the overhead of spawning additional child JVMs. It takes about 1 second to spawn a JVM, so if you have a lot of tasks that take, say, 3 seconds each, setting JVM reuse to greater than 1 can be a big performance win, since otherwise most of the time is spent creating the JVM for each task. In this particular job we did not have short-lived tasks, but we had an incorrect default that reused the JVM 4 times.
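For anyone hitting the same thing, this is roughly how we turned it back off for the job (a sketch against the old mapred API; 1 means one task per child JVM, which is the normal default):

import org.apache.hadoop.mapred.JobConf;

public class JobSetupSketch {

    public static JobConf configure() {
        JobConf conf = new JobConf();
        // 1 task per child JVM (the normal default); our cluster-wide default
        // had been set to 4, which is what triggered the problem.
        // -1 would mean "reuse the JVM for an unlimited number of tasks".
        conf.setInt("mapred.job.reuse.jvm.num.tasks", 1);
        return conf;
    }
}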
I'm not quite sure at this point whether anything was being left open; not from what I can tell initially. I'm thinking something in the UDF was not being closed correctly.
There was no impact on memory utilization, as the number of documents per map task did not change; we were just holding 1,000 documents in memory for the bulk request and then discarding them. So the job runs just as fast as ever, but with no stability or memory issues. Now we are on to performance tuning our schema and fields.