Leaked TPC file descriptors for Hadoop Gateway?

Carl_C · September 27, 2012, 10:56pm

Hi everyone,

We are using ElasticSearch to index documents stored in HBase, so I have
set up ElasticSearch to use the Hadoop gateway, pointing at our HDFS
cluster:

gateway.type: hdfs
gateway.hdfs.uri: hdfs://:8020
gateway.hdfs.path: /elasticsearch

I've been running load tests on the system and find that after a day or
two, ES will hit its open file limit (presently set to 64,000). I've looked
at the output of 'sudo lsof -u elasticsearch' and there are many thousands
of open TCP connections in the CLOSE_WAIT state:

java 14109 elasticsearch 661u IPv6 9820568 0t0 TCP :38257->:50010 (CLOSE_WAIT)
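
To see which remote ports the stuck sockets point at, the lsof output can be tallied with a small script. What follows is only a minimal Python sketch: the function name count_close_wait is hypothetical, and it assumes each lsof entry sits on a single line with the connection written as local->remote.

```python
from collections import Counter


def count_close_wait(lsof_text):
    """Tally CLOSE_WAIT sockets by remote port from captured lsof text.

    Assumes one lsof entry per line and a 'local->remote' token in each
    TCP entry (hypothetical helper, not part of any Elasticsearch tooling).
    """
    counts = Counter()
    for line in lsof_text.splitlines():
        if "(CLOSE_WAIT)" not in line:
            continue  # skip sockets in any other state
        for token in line.split():
            if "->" in token:
                remote = token.split("->", 1)[1]
                # The port is whatever follows the last colon, which also
                # works when the host part is empty or an IPv6 address.
                port = remote.rsplit(":", 1)[-1]
                counts[port] += 1
                break
    return counts
```

Passing it the captured text of 'sudo lsof -u elasticsearch' and calling most_common() on the result lists the busiest remote ports first. Port 50010 is the default DataNode data-transfer port in Hadoop releases of this era, so a large count there would point at HDFS block connections rather than at the NameNode on 8020.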

I'm just starting to look into the problem more closely, but I found there
was a previous mailing list discussion about essentially the same issue:
http://elasticsearch-users.115913.n3.nabble.com/CLOSE-WAIT-Sockets-td1884117.html.
There wasn't any resolution, so I thought I'd ask here if anyone has seen
or resolved this problem since.

Some more pertinent information:

ElasticSearch version is 0.19.9

We use the Cloudera Hadoop distribution rather than the Apache
distribution. As a result, we cannot use the hadoop-core jar included with
the elasticsearch-hadoop plugin. I have worked around the problem by
prepending the locations of the correct jar files to ES_CLASSPATH in
elasticsearch.in.sh. I have a feeling this may be related to the problem,
but that's nothing more than a vague hunch.

Anyway, if anyone has seen similar problems or has suggestions for
debugging strategies, let me know.

Thanks,
Carl

--

Carl_C · September 27, 2012, 10:58pm

Whoops, clearly the title should be "Leaked TCP file descriptors"

--
