Elasticsearch-hadoop sporadic timeouts

Hi - we're having problems with one of our map-reduce jobs that writes to
Elasticsearch. Lots of map tasks are failing due to ES being "unavailable",
with logs like this:

https://gist.githubusercontent.com/zcox/3d6cf4329d49ca03271b/raw/57c46a5e4c9ea04d5c4209414d6f847492d16c0d/gistfile1.txt

It seems like elasticsearch-hadoop tries talking to an ES node, times out,
tries the next one, times out, and so on until all nodes in the cluster are
exhausted, and then it gives up.
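
To make the behaviour I'm describing concrete, here is a rough sketch of the retry pattern as I understand it from the logs. This is not elasticsearch-hadoop's actual code: the class, the node names and the `respondsInTime` predicate (a stand-in for an HTTP call with a timeout) are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of "try each node until one answers"; not the real client.
class NodeFailoverSketch {

    /** Outcome of one write attempt across the node list. */
    static final class Result {
        final String servedBy;        // node that answered, or null if none did
        final List<String> timedOut;  // nodes that were tried and timed out

        Result(String servedBy, List<String> timedOut) {
            this.servedBy = servedBy;
            this.timedOut = timedOut;
        }
    }

    /**
     * Tries each node in order. respondsInTime is an assumed stand-in for a
     * real request with a timeout: it returns false when the node times out.
     */
    static Result writeWithFailover(List<String> nodes, Predicate<String> respondsInTime) {
        List<String> timedOut = new ArrayList<>();
        for (String node : nodes) {
            if (respondsInTime.test(node)) {
                return new Result(node, timedOut);
            }
            timedOut.add(node); // this node timed out, move on to the next one
        }
        // Every node timed out: this is where the task attempt would fail.
        return new Result(null, timedOut);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("es1:9200", "es2:9200", "es3:9200", "es4:9200");

        // Only the third node answers: two timeouts are paid before the write succeeds.
        Result partial = writeWithFailover(nodes, n -> n.equals("es3:9200"));
        System.out.println("served by " + partial.servedBy
                + " after " + partial.timedOut.size() + " timeouts");

        // No node answers: all four are exhausted and the attempt fails.
        Result exhausted = writeWithFailover(nodes, n -> false);
        System.out.println("served by " + exhausted.servedBy
                + " after " + exhausted.timedOut.size() + " timeouts");
    }
}
```

If this is roughly right, a failed attempt costs one full timeout per node before it gives up, which would explain why the killed tasks waste so much time.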

As far as I can tell, the ES cluster is healthy while this is occurring.
Many map tasks are succeeding - probably about 10% of the attempts are
killed due to this issue. The main problem is that these killed tasks waste
a lot of time and slow down the overall job execution.

I'm not sure where to troubleshoot this next. Does anyone have any idea
what would cause all of these timeouts and failures?

I'm also curious about the lines like this:

2014-09-30 12:49:20,469 WARN org.apache.commons.httpclient.SimpleHttpConnectionManager: SimpleHttpConnectionManager being used incorrectly. Be sure that HttpMethod.releaseConnection() is always called and that only one thread and/or method is using this connection manager at a time.

Would that be related to the timeout problem we're seeing?

Thanks,
Zach

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f304a286-399f-4dea-b7f0-032b19ad67e6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

What version of es-hadoop/es/cascading are you using?

--
Costin

Hi Costin:

elasticsearch-hadoop 2.0.0
cascading 2.5.4
scalding 0.10.0

Thanks,
Zach

Can you please try 2.0.2.BUILD-SNAPSHOT? I think you might be running into issue #256, which was fixed some time ago
and will be part of the upcoming 2.0.2 and 2.1 Beta2.

Cheers,

--
Costin

Hi Costin - we updated our dependencies to use elasticsearch-hadoop
2.0.2.BUILD-SNAPSHOT, but that didn't seem to change anything. We're still
seeing the same task failures while trying to write to Elasticsearch. The
only difference in the logs is that now I don't see
the SimpleHttpConnectionManager warnings.

Any ideas what we could try next?

Thanks,
Zach

The error indicates that the ES nodes don't reply in a timely fashion, so the connection drops. Based on your logs, it
seems to be either a GC or a network issue.
You could try setting the log level for the 'org.elasticsearch.hadoop.rest' package to DEBUG.
How many tasks do you have, and what are your bulk size/entries numbers?
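
To illustrate what "doesn't reply in a timely fashion" looks like from the client side, here is a small standalone sketch using only plain JDK sockets. It is not es-hadoop code: the class name and the silent local server are assumptions made for the demo, standing in for a node stuck in a long GC pause or behind a slow network.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo: a server that accepts connections but never answers.
class ReadTimeoutSketch {

    /**
     * Connects to a local server socket that never writes a response and
     * reports whether the client's read gave up after timeoutMillis.
     */
    static boolean readTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port; the connection completes via the
        // listen backlog even though accept() is never called.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // read timeout on the client side
            InputStream in = client.getInputStream();
            try {
                in.read();     // blocks: the server never sends anything
                return false;  // only reached if data or end-of-stream arrived
            } catch (SocketTimeoutException e) {
                return true;   // the read gave up after timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

The connection itself succeeds here, and only the read times out. That is consistent with a cluster that looks healthy while individual requests still fail.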

--
Costin

This particular job has 1353 map tasks. The Hadoop cluster has 5 nodes with a
total map task capacity of 25, and the Elasticsearch cluster has 4 nodes.
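
For a rough sense of scale, the arithmetic for these numbers can be sketched as below. It assumes that every map task writes to Elasticsearch, that all 25 slots are busy at once, and that writes are spread evenly across the ES nodes; none of that has been verified, and the class and method names are only illustrative.

```java
// Back-of-the-envelope numbers for the job described above (illustrative names).
class JobShapeSketch {

    /** Number of "waves" needed to run totalTasks with the given slot count (ceiling division). */
    static int waves(int totalTasks, int slots) {
        return (totalTasks + slots - 1) / slots;
    }

    /** Average number of concurrent writers per ES node, assuming an even spread. */
    static double writersPerNode(int slots, int esNodes) {
        return (double) slots / esNodes;
    }

    public static void main(String[] args) {
        int mapTasks = 1353;  // from the job
        int mapSlots = 25;    // total map task capacity of the Hadoop cluster
        int esNodes = 4;      // Elasticsearch cluster size

        System.out.println("waves of map tasks: " + waves(mapTasks, mapSlots));               // 55
        System.out.println("concurrent writers per ES node: " + writersPerNode(mapSlots, esNodes)); // 6.25
    }
}
```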

Where can I find the bulk size/entries numbers?

Thanks,
Zach

Hi Costin - by "bulk size/entries number" are you referring to the
es.batch.size.bytes and es.batch.size.entries config values described here?

http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#configuration-serialization

It looks like the only elasticsearch-related config value we're setting is
this:

es.input.json = true

So we must be using default values for those es.batch.size config values.
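
Here is a rough sketch of what those defaults would mean for this job. It assumes the documented defaults are 1mb for es.batch.size.bytes and 1000 for es.batch.size.entries (worth checking against the docs for the version in use), and that the batch size applies per task, so it is multiplied by the 25 concurrent map tasks. A plain java.util.Properties stands in for the real Hadoop job configuration, and the class and helper names are made up.

```java
import java.util.Properties;

// Hypothetical sketch: estimates the bulk data in flight when every task uses the defaults.
class BulkLoadSketch {

    /** Settings as they would effectively apply; the two batch values are assumed defaults. */
    static Properties settings() {
        Properties p = new Properties();
        p.setProperty("es.input.json", "true");          // the only value set explicitly
        p.setProperty("es.batch.size.bytes", "1mb");     // assumed default
        p.setProperty("es.batch.size.entries", "1000");  // assumed default
        return p;
    }

    /** Parses sizes such as "512kb" or "1mb" into bytes; plain numbers are taken as bytes. */
    static long parseBytes(String size) {
        String s = size.trim().toLowerCase();
        if (s.endsWith("mb")) {
            return Long.parseLong(s.substring(0, s.length() - 2)) * 1024L * 1024L;
        }
        if (s.endsWith("kb")) {
            return Long.parseLong(s.substring(0, s.length() - 2)) * 1024L;
        }
        return Long.parseLong(s);
    }

    /** Upper bound on bulk bytes in flight if every concurrent task sends a full batch. */
    static long maxInFlightBytes(Properties p, int concurrentTasks) {
        return parseBytes(p.getProperty("es.batch.size.bytes")) * concurrentTasks;
    }

    /** Upper bound on documents in flight if every concurrent task sends a full batch. */
    static long maxInFlightEntries(Properties p, int concurrentTasks) {
        return Long.parseLong(p.getProperty("es.batch.size.entries")) * concurrentTasks;
    }

    public static void main(String[] args) {
        Properties p = settings();
        int concurrentTasks = 25; // map task capacity of the Hadoop cluster
        System.out.println("max bulk bytes in flight: " + maxInFlightBytes(p, concurrentTasks));
        System.out.println("max bulk entries in flight: " + maxInFlightEntries(p, concurrentTasks));
    }
}
```

Under those assumptions the 4 ES nodes could see up to 25 full batches at the same time, so lowering the two batch values would be one knob to try.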

Thanks,
Zach

On Wed, Oct 1, 2014 at 7:41 AM, Zach Cox zcox522@gmail.com wrote:

This particular job has 1353 map tasks, Hadoop cluster has 5 nodes with
total map task capacity of 25. Elasticsearch cluster has 4 nodes.

Where can I find the bulk size/entries numbers?

Thanks,
Zach

Is there anything else we could try here to debug elasticsearch-hadoop
being unable to write to Elasticsearch? We're still seeing the same number
of these failures during the nightly batch runs even after switching to
2.0.2.BUILD-SNAPSHOT, and I don't see any additional lines from
org.elasticsearch.hadoop.rest in the tasktracker logs after adding a DEBUG
logger for it to our logback.xml.

Here are example logs for a task that failed this morning:
https://gist.githubusercontent.com/zcox/c5c81f4f8ee26d7bedbf/raw/ac9bef7bfe3a8e1ef69a62aabe4c7c3983882f19/gistfile1.txt

Is it reasonable to expect 4 Elasticsearch nodes to handle the batch write
volume from 5 Hadoop nodes (25 map tasks)?

Thanks,
Zach

You can always enable TRACE though that is likely to create way too much information in production and slow things down
considerably.

The first thing you can do is give ES more breathing space by reducing the batch size (say to 512KB) or the number
of entries (500 instead of 1k, etc.).
However, the error indicates a network issue, not an Elasticsearch one, so check whether there's some type of
service/network/firewall initialization happening at night. Do the errors occur around the same time? Is there some
backup procedure that potentially kicks in?

Potentially you can try and increase the default http timeout (es.http.timeout) from 1m to 3m or so. However this is
really a patch since if the ES server doesn't return a response in 1m, it means things are not going well at all.

--
Costin
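
For reference, the three settings mentioned above could be collected and rendered as Hadoop `-D` generic options roughly as follows. This is only a sketch: the values are the ones suggested in this message, the helper name `to_generic_options` is made up for illustration, the `512kb` spelling is assumed to follow the same style as the documented `1mb` default, and whether `-D` options reach the job at all depends on how it is launched (some setups set these on the job configuration in code instead).

```python
# Sketch only: es-hadoop write settings from the suggestions above.
# The values are starting points to experiment with, not recommendations.
ES_WRITE_SETTINGS = {
    "es.batch.size.bytes": "512kb",   # es-hadoop default is 1mb
    "es.batch.size.entries": "500",   # es-hadoop default is 1000
    "es.http.timeout": "3m",          # es-hadoop default is 1m
}


def to_generic_options(settings):
    """Render settings as '-Dkey=value' strings (hypothetical helper).

    Keys are sorted so the output is stable from run to run.
    """
    return ["-D{}={}".format(key, value) for key, value in sorted(settings.items())]


if __name__ == "__main__":
    # Prints the options on one line, ready to paste into a launch command.
    print(" ".join(to_generic_options(ES_WRITE_SETTINGS)))
```

Changing one setting at a time makes it easier to tell which of them, if any, affects the failure rate.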

Our Hadoop and Elasticsearch clusters are both on AWS. We have 2 MR jobs
that write to ES: one of them works fine, and the other takes forever due
to 10-20% of tasks failing in the way I've described. So I don't think it's
any kind of network/firewall issue. There are no nightly backups or
anything similar related to ES.

Would 25 map tasks batch writing to a 4-node Elasticsearch cluster cause it
so much pain that it stops responding to writes? If I ssh to the ES nodes
while this job is running and some tasks are failing, it doesn't seem like
ES is under much stress.

Thanks,
Zach
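
As a rough back-of-the-envelope check on that question, the worst-case amount of bulk data in flight can be estimated from the numbers in this thread. The sketch below assumes the es-hadoop defaults of 1mb and 1000 entries per bulk request (the job does not override them), at most one bulk request in flight per map task, and an even spread across the 4 ES nodes. None of these assumptions has been verified against the actual job, and the function name is made up for illustration.

```python
# Numbers taken from the thread; batch limits are the es-hadoop defaults.
MAP_SLOTS = 25               # concurrent map tasks on the Hadoop cluster
ES_NODES = 4                 # nodes in the Elasticsearch cluster
BATCH_BYTES = 1024 * 1024    # es.batch.size.bytes default (1mb)
BATCH_ENTRIES = 1000         # es.batch.size.entries default

BYTES_PER_MB = 1024 * 1024


def in_flight_upper_bound(map_slots, es_nodes, batch_bytes, batch_entries):
    """Upper bound on bulk data in flight, assuming one request per task
    and an even spread of requests across the ES nodes."""
    total_mb = map_slots * batch_bytes / BYTES_PER_MB
    total_entries = map_slots * batch_entries
    return {
        "total_mb": total_mb,
        "total_entries": total_entries,
        "mb_per_node": total_mb / es_nodes,
        "entries_per_node": total_entries / es_nodes,
    }


if __name__ == "__main__":
    bound = in_flight_upper_bound(MAP_SLOTS, ES_NODES, BATCH_BYTES, BATCH_ENTRIES)
    for name, value in bound.items():
        print(name, value)
```

Under these assumptions the bound is 25 MB (25,000 documents) in flight cluster-wide, or 6.25 MB per node. The number alone does not say whether the cluster can keep up; document size, mappings, and whatever else the nodes are doing also matter.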

On Fri, Oct 3, 2014 at 10:46 AM, Costin Leau costin.leau@gmail.com wrote:

You can always enable TRACE though that is likely to create way too much
information in production and slow things down considerably.

The first thing you can do is minimize the batch size to give ES more
breathing space by minimizing the batch size (say to 512KB) or the number
of entries (500 instead of 1k, etc...).
However the error indicates a network issue not an Elasticsearch one so
check whether there's some type of service/network/firewall initialization
happening at night. Do the errors occur around the same time? Is there some
backup procedure that potentially kicks in?

Potentially you can try and increase the default http timeout
(es.http.timeout) from 1m to 3m or so. However this is really a patch since
if the ES server doesn't return a response in 1m, it means things are not
going well at all.

On 10/3/14 6:09 PM, Zach Cox wrote:

Is there anything else we could try here to debug elasticsearch-hadoop
being unable to write to Elasticsearch? We're
still seeing the same number of these fails during the nightly batch runs
even after switching to 2.0.2.BUILD-SNAPSHOT,
and I don't see any additional lines from org.elasticsearch.hadoop.__rest
in the tasktracker logs after adding to our logback.xml.

Here are example logs for a task that failed this morning:
https://gist.githubusercontent.com/zcox/c5c81f4f8ee26d7bedbf/raw/
ac9bef7bfe3a8e1ef69a62aabe4c7c3983882f19/gistfile1.txt

Is it reasonable to expect 4 Elasticsearch nodes to handle the batch
write volume from 5 Hadoop nodes (25 map tasks)?

Thanks,
Zach

On Wed, Oct 1, 2014 at 8:00 AM, Zach Cox <zcox522@gmail.com <mailto:
zcox522@gmail.com>> wrote:

Hi Costin - by "bulk size/entries number" are you referring to the

es.batch.size.bytes and es.batch.size.entries
config values described here?

http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/

master/configuration.html#configuration-serialization

It looks like the only elasticsearch-related config values we're

setting is this:

es.input.json = true

So we must be using default values for those es.batch.size config

values.

Thanks,
Zach



On Wed, Oct 1, 2014 at 7:41 AM, Zach Cox <zcox522@gmail.com <mailto:

zcox522@gmail.com>> wrote:

    This particular job has 1353 map tasks, Hadoop cluster has 5

nodes with total map task capacity of 25.
Elasticsearch cluster has 4 nodes.

    Where can I find the bulk size/entries numbers?


    Thanks,
    Zach



    On Wed, Oct 1, 2014 at 7:19 AM, Costin Leau <

costin.leau@gmail.com mailto:costin.leau@gmail.com> wrote:
> The error indicates the ES nodes don't reply in a timely
fashion and thus
> the connection drops. Based on your logs it seems to be either
a GC or a
> network issue.
> You could try turning on logging in package
'org.elasticsearch.hadoop.rest'
> to DEBUG.
> How many tasks do you have and what's your bulk size/entries
number?
>
>
> On 10/1/14 2:15 PM, Zach Cox wrote:
>>
>> Hi Costin - we updated our dependencies to use
elasticsearch-hadoop
>> 2.0.2.BUILD-SNAPSHOT, but that didn't seem to change
>> anything. We're still seeing the same task failures while
trying to write
>> to Elasticsearch. The only difference in the
>> logs is that now I don't see the SimpleHttpConnectionManager
warnings.
>>
>> Any ideas what we could try next?
>>
>> Thanks,
>> Zach
>>
>>
>> On Tuesday, September 30, 2014 10:54:27 AM UTC-5, Costin Leau
wrote:
>>
>> Can you please try the 2.0.2.BUILD-SNAPSHOT? I think you
might be
>> running into issue #256 which was fixed some time ago
>> and will be part of the upcoming
>> 2.0.2, 2.1 Beta2.
>>
>> Cheers,
>>
>> On 9/30/14 6:43 PM, Zach Cox wrote:
>> > Hi Costin:
>> >
>> > elasticsearch-hadoop 2.0.0
>> > cascading 2.5.4
>> > scalding 0.10.0
>> >
>> > Thanks,
>> > Zach

What type of AWS instances are you using? Virtualization tends to interfere in various ways with a running system -
sometimes for better, sometimes for worse.

The number of tasks is useful for computing the total amount of data and number of entries you are throwing at ES at one
time. With 25 concurrent tasks at the default batch settings, you are looking at a maximum of 25 MB or 25K entries
against the 4-node Elasticsearch cluster. Depending on how beefy the machines are and what type of storage and mapping
you have, it might be quick or slow - we can only guess. Try using Marvel or other monitoring tools so you can see
whether there's load building up or if anything unusual happens.
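
The 25 MB / 25K figure can be reproduced with a little arithmetic. This is only a minimal sketch: it assumes each map task keeps at most one bulk request in flight, and that the job runs with the es-hadoop defaults for es.batch.size.bytes (1mb) and es.batch.size.entries (1000). The function name and the 512 KB / 500-entry comparison are there purely for illustration.

```python
# Rough upper bound on what concurrent tasks can send to ES at one time.
# Assumption: each task has at most one bulk request in flight.

def peak_bulk_load(concurrent_tasks, batch_bytes, batch_entries):
    """Return (total_bytes, total_entries) if every task flushes at once."""
    return concurrent_tasks * batch_bytes, concurrent_tasks * batch_entries

MB = 1024 * 1024
KB = 1024

# 25 map slots with the default batch settings (1mb / 1000 entries).
default_bytes, default_entries = peak_bulk_load(25, 1 * MB, 1000)
print(default_bytes // MB, "MB /", default_entries, "entries")

# Same 25 slots with a halved batch (512 KB / 500 entries).
small_bytes, small_entries = peak_bulk_load(25, 512 * KB, 500)
print(small_bytes / MB, "MB /", small_entries, "entries")
```

Halving the batch settings halves the worst-case burst, which is one way to check whether the timeouts track the write volume.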

Since it's unclear what the issue might be, take baby steps [1]: start with minimal load (smaller bulk size + fewer
tasks), see whether there are any issues, and keep on going from there.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/troubleshooting.html

On 10/3/14 6:59 PM, Zach Cox wrote:

Our Hadoop and Elasticsearch are all on AWS. We have 2 MR jobs that write to ES - 1 of them works fine, and one of them
takes forever due to 10-20% of tasks failing in the way I've described. So I don't think it's any kind of
network/firewall issue. There are no nightly backups related to ES or anything.

Would 25 map tasks batch writing to a 4-node Elasticsearch cluster cause it so much pain that it stops responding to
writes? If I ssh to the ES nodes while this job is running and some tasks are failing, it doesn't seem like ES is under
much stress.

Thanks,
Zach

On Fri, Oct 3, 2014 at 10:46 AM, Costin Leau <costin.leau@gmail.com> wrote:

You can always enable TRACE logging, though that is likely to create way too much information in production and slow
things down considerably.

The first thing you can do is give ES more breathing space by reducing the batch size (say to 512KB) or the number
of entries (500 instead of 1k, etc.).
However the error indicates a network issue not an Elasticsearch one so check whether there's some type of
service/network/firewall initialization happening at night. Do the errors occur around the same time? Is there some
backup procedure that potentially kicks in?

Potentially you can try and increase the default http timeout (es.http.timeout) from 1m to 3m or so. However this is
really a patch since if the ES server doesn't return a response in 1m, it means things are not going well at all.
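
As a rough illustration of the settings mentioned above, the sketch below collects the three es-hadoop properties (es.batch.size.bytes, es.batch.size.entries, es.http.timeout) with the suggested values and renders them as -Dkey=value arguments. Passing them this way assumes the job is launched through Hadoop's generic options handling, which may not match how this particular job is configured. The helper name is hypothetical, and the unit suffixes simply follow the style of the documented defaults (1mb, 1m).

```python
# Suggested overrides from the message above; starting points, not tuned numbers.
ES_OVERRIDES = {
    "es.batch.size.bytes": "512kb",   # default is 1mb
    "es.batch.size.entries": "500",   # default is 1000
    "es.http.timeout": "3m",          # default is 1m
}

def as_hadoop_args(overrides):
    """Render settings as -Dkey=value arguments (hypothetical helper)."""
    return ["-D{}={}".format(key, value) for key, value in sorted(overrides.items())]

print(" ".join(as_hadoop_args(ES_OVERRIDES)))
```

Changing one setting at a time makes it easier to tell which one, if any, affects the failure rate.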

On 10/3/14 6:09 PM, Zach Cox wrote:

    Is there anything else we could try here to debug elasticsearch-hadoop being unable to write to Elasticsearch? We're
    still seeing the same number of these failures during the nightly batch runs even after switching to
    2.0.2.BUILD-SNAPSHOT, and I don't see any additional lines from org.elasticsearch.hadoop.rest in the tasktracker
    logs after adding <logger name="org.elasticsearch.hadoop.rest" level="DEBUG"/> to our logback.xml.

    Here are example logs for a task that failed this morning:
    https://gist.githubusercontent.com/zcox/c5c81f4f8ee26d7bedbf/raw/ac9bef7bfe3a8e1ef69a62aabe4c7c3983882f19/gistfile1.txt

    Is it reasonable to expect 4 Elasticsearch nodes to handle the batch write volume from 5 Hadoop nodes (25 map
    tasks)?

    Thanks,
    Zach


    On Wed, Oct 1, 2014 at 8:00 AM, Zach Cox <zcox522@gmail.com> wrote:

         Hi Costin - by "bulk size/entries number" are you referring to the es.batch.size.bytes and
         es.batch.size.entries config values described here?

         http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#configuration-serialization

         It looks like the only elasticsearch-related config value we're setting is this:

         es.input.json = true

         So we must be using default values for those es.batch.size config values.

         Thanks,
         Zach



         On Wed, Oct 1, 2014 at 7:41 AM, Zach Cox <zcox522@gmail.com> wrote:

             This particular job has 1353 map tasks, Hadoop cluster has 5 nodes with total map task capacity of 25.
             Elasticsearch cluster has 4 nodes.

             Where can I find the bulk size/entries numbers?


             Thanks,
             Zach



             On Wed, Oct 1, 2014 at 7:19 AM, Costin Leau <costin.leau@gmail.com> wrote:
             > The error indicates the ES nodes don't reply in a timely fashion and thus
             > the connection drops. Based on your logs it seems to be either a GC or a
             > network issue.
             > You could try turning on logging in package 'org.elasticsearch.hadoop.rest'
             > to DEBUG.
             > How many tasks do you have and what's your bulk size/entries number?



--
Costin


Our 4 ES nodes are all m1.large (http://www.ec2instances.info/?filter=m1.large)
and our 5 Hadoop nodes are all m1.xlarge
(http://www.ec2instances.info/?filter=m1.xlarge).

Thanks for the troubleshooting pointers - we'll do some more research.

On Fri, Oct 3, 2014 at 11:27 AM, Costin Leau costin.leau@gmail.com wrote:

What type of AWS instances are you using? Virtualization tends to
interfere in various ways with a running system - sometime for good,
sometimes for worse.

The number of tasks is good to compute the total number of data and
entries you are throwing at ES at one time. You are looking at a maximum of
25 MBs or 25K entries against the 4 node Elastic. Depending on how beefy
the machines are and what type of storage and mapping you have, it might be
quick or slow - we can only guess. Try using Marvel or other monitoring
tools so you can see whether there's load building up or if anything
unusual happens.

Since it's unclear what the issue might be, take baby steps [1] and start
with minimal load (smaller bulk size + less tasks) see whether there are
any issues and keep on going.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/
master/troubleshooting.html

On 10/3/14 6:59 PM, Zach Cox wrote:

Our Hadoop and Elasticsearch are all on AWS. We have 2 MR jobs that write
to ES - 1 of them works fine, and one of them
takes forever due to 10-20% of tasks failing in the way I've described.
So I don't think it's any kind of
network/firewall issue. There are no nightly backups related to ES or
anything.

Would 25 map tasks batch writing to a 4-node Elasticsearch cluster cause
it so much pain that it stops responding to
writes? If I ssh to the ES nodes while this job is running and some tasks
are failing, it doesn't seem like ES is under
much stress.

Thanks,
Zach

On Fri, Oct 3, 2014 at 10:46 AM, Costin Leau <costin.leau@gmail.com
mailto:costin.leau@gmail.com> wrote:

You can always enable TRACE though that is likely to create way too

much information in production and slow things
down considerably.

The first thing you can do is minimize the batch size to give ES more

breathing space by minimizing the batch size
(say to 512KB) or the number
of entries (500 instead of 1k, etc...).
However the error indicates a network issue not an Elasticsearch one
so check whether there's some type of
service/network/firewall initialization happening at night. Do the
errors occur around the same time? Is there some
backup procedure that potentially kicks in?

Potentially you can try and increase the default http timeout

(es.http.timeout) from 1m to 3m or so. However this is
really a patch since if the ES server doesn't return a response in
1m, it means things are not going well at all.

On 10/3/14 6:09 PM, Zach Cox wrote:

    Is there anything else we could try here to debug elasticsearch-hadoop
    being unable to write to Elasticsearch? We're still seeing the same
    number of these failures during the nightly batch runs even after
    switching to 2.0.2.BUILD-SNAPSHOT, and I don't see any additional lines
    from org.elasticsearch.hadoop.rest in the tasktracker logs after adding
    it to our logback.xml.

    Here are example logs for a task that failed this morning:
    https://gist.githubusercontent.com/zcox/c5c81f4f8ee26d7bedbf/raw/ac9bef7bfe3a8e1ef69a62aabe4c7c3983882f19/gistfile1.txt

    Is it reasonable to expect 4 Elasticsearch nodes to handle the batch
    write volume from 5 Hadoop nodes (25 map tasks)?

    Thanks,
    Zach


    On Wed, Oct 1, 2014 at 8:00 AM, Zach Cox <zcox522@gmail.com> wrote:

         Hi Costin - by "bulk size/entries number" are you referring to the
         es.batch.size.bytes and es.batch.size.entries config values
         described here?

         http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#configuration-serialization

         It looks like the only elasticsearch-related config value we're
         setting is this:

         es.input.json = true

         So we must be using default values for those es.batch.size config
         values.

         Thanks,
         Zach



         On Wed, Oct 1, 2014 at 7:41 AM, Zach Cox <zcox522@gmail.com> wrote:

             This particular job has 1353 map tasks, Hadoop cluster has 5
             nodes with total map task capacity of 25. Elasticsearch cluster
             has 4 nodes.

             Where can I find the bulk size/entries numbers?


             Thanks,
             Zach



             On Wed, Oct 1, 2014 at 7:19 AM, Costin Leau <costin.leau@gmail.com> wrote:
> The error indicates the ES nodes don't reply in a timely fashion and thus
> the connection drops. Based on your logs it seems to be either a GC or a
> network issue.
> You could try turning on logging in package 'org.elasticsearch.hadoop.rest'
> to DEBUG.
> How many tasks do you have and what's your bulk size/entries number?
>
>
> On 10/1/14 2:15 PM, Zach Cox wrote:
>>
>> Hi Costin - we updated our dependencies to use elasticsearch-hadoop
>> 2.0.2.BUILD-SNAPSHOT, but that didn't seem to change anything. We're still
>> seeing the same task failures while trying to write to Elasticsearch. The
>> only difference in the logs is that now I don't see the
>> SimpleHttpConnectionManager warnings.
>>
>> Any ideas what we could try next?
>>
>> Thanks,
>> Zach
>>
>>
>> On Tuesday, September 30, 2014 10:54:27 AM UTC-5, Costin Leau wrote:
>>
>> Can you please try the 2.0.2.BUILD-SNAPSHOT? I think you might be
>> running into issue #256 which was fixed some time ago and will be part
>> of the upcoming 2.0.2, 2.1 Beta2.
>>
>> Cheers,
>>
>> On 9/30/14 6:43 PM, Zach Cox wrote:
>> > Hi Costin:
>> >
>> > elasticsearch-hadoop 2.0.0
>> > cascading 2.5.4
>> > scalding 0.10.0
>> >
>> > Thanks,
>> > Zach
>> >
>> >
>> > On Tuesday, September 30, 2014 10:25:10 AM UTC-5, Costin Leau wrote:
>> >
>> > What version of es-hadoop/es/cascading are you using?
>> >
>> > --
>> > Costin
>> >
>>
>> --
>> Costin
>>
>
>
> --
> Costin
>

--
Costin


--
Costin

