Our Hadoop and Elasticsearch clusters are both on AWS. We have 2 MR jobs that write to ES: one works fine, and the other takes forever because 10-20% of its tasks fail in the way I've described. So I don't think it's any kind of network/firewall issue. There are no nightly backups related to ES or anything.
Would 25 map tasks batch writing to a 4-node Elasticsearch cluster cause it so much pain that it stops responding to writes? If I ssh to the ES nodes while this job is running and some tasks are failing, it doesn't seem like ES is under much stress.
Thanks,
Zach
On Fri, Oct 3, 2014 at 10:46 AM, Costin Leau <costin.leau@gmail.com> wrote:
You can always enable TRACE, though that is likely to create way too much information in production and slow things down considerably.
The first thing you can do is give ES more breathing space by reducing the batch size (say to 512KB) or the number of entries (500 instead of 1k, etc...).
However, the error indicates a network issue, not an Elasticsearch one, so check whether there's some type of service/network/firewall initialization happening at night. Do the errors occur around the same time? Is there some backup procedure that potentially kicks in?
You can also try increasing the default HTTP timeout (es.http.timeout) from 1m to 3m or so. However, this is really a patch: if the ES server doesn't return a response in 1m, things are not going well at all.
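For illustration only, the settings discussed above could be collected as in the sketch below. The property names es.batch.size.bytes, es.batch.size.entries, es.http.timeout and es.input.json are the ones named in this thread; the class and method names are hypothetical, the values are simply the ones floated above (512KB, 500 entries, 3m), and the value syntax (for example "512kb") is an assumption that should be checked against the es-hadoop configuration page. How the properties are handed to the job (Hadoop Configuration, Cascading/Scalding properties) depends on the setup and is not shown.

```java
import java.util.Properties;

/** Hypothetical helper that gathers the es-hadoop write settings discussed in this thread. */
public class EsWriteSettings {

    /** Returns bulk-write settings that are gentler on ES than the defaults. */
    public static Properties conservative() {
        Properties props = new Properties();
        // The only ES setting the job in this thread already sets.
        props.setProperty("es.input.json", "true");
        // Smaller bulk requests: flush after ~512KB or 500 entries, whichever comes first.
        // The "512kb" spelling is an assumption about the accepted value format.
        props.setProperty("es.batch.size.bytes", "512kb");
        props.setProperty("es.batch.size.entries", "500");
        // Longer HTTP timeout (1m raised to 3m); a stopgap rather than a fix.
        props.setProperty("es.http.timeout", "3m");
        return props;
    }

    public static void main(String[] args) {
        Properties props = conservative();
        // Print the settings so they can be copied into the job configuration.
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " = " + props.getProperty(name));
        }
    }
}
```

Changing one setting at a time (batch size first, timeout last) makes it easier to tell which change, if any, reduces the task failures.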
On 10/3/14 6:09 PM, Zach Cox wrote:
Is there anything else we could try here to debug elasticsearch-hadoop being unable to write to Elasticsearch? We're still seeing the same number of these failures during the nightly batch runs even after switching to 2.0.2.BUILD-SNAPSHOT, and I don't see any additional lines from org.elasticsearch.hadoop.rest in the tasktracker logs after adding it to our logback.xml.
Here are example logs for a task that failed this morning:
https://gist.githubusercontent.com/zcox/c5c81f4f8ee26d7bedbf/raw/ac9bef7bfe3a8e1ef69a62aabe4c7c3983882f19/gistfile1.txt
Is it reasonable to expect 4 Elasticsearch nodes to handle the batch write volume from 5 Hadoop nodes (25 map tasks)?
Thanks,
Zach
On Wed, Oct 1, 2014 at 8:00 AM, Zach Cox <zcox522@gmail.com> wrote:
Hi Costin - by "bulk size/entries number" are you referring to the es.batch.size.bytes and es.batch.size.entries config values described here?
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#configuration-serialization
It looks like the only elasticsearch-related config value we're setting is this:
es.input.json = true
So we must be using the default values for those es.batch.size config values.
Thanks,
Zach
On Wed, Oct 1, 2014 at 7:41 AM, Zach Cox <zcox522@gmail.com> wrote:
This particular job has 1353 map tasks; the Hadoop cluster has 5 nodes with a total map task capacity of 25. The Elasticsearch cluster has 4 nodes.
Where can I find the bulk size/entries numbers?
Thanks,
Zach
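As a back-of-the-envelope check on these numbers, the sketch below estimates the worst-case write pressure. It assumes the bulk entry limit is 1000 per task (the "1k" mentioned elsewhere in this thread) and that every running map task flushes a full bulk at the same moment; the class and method names are hypothetical.

```java
/** Hypothetical back-of-the-envelope estimate of bulk write pressure on ES. */
public class BulkLoadEstimate {

    /** Upper bound on documents in flight if every running task sends a full bulk at once. */
    public static long maxDocsInFlight(int concurrentTasks, int entriesPerBulk) {
        return (long) concurrentTasks * entriesPerBulk;
    }

    /** Number of scheduling waves needed to run all tasks with the given number of slots. */
    public static int waves(int totalTasks, int slots) {
        return (totalTasks + slots - 1) / slots; // ceiling division
    }

    public static void main(String[] args) {
        int slots = 25;            // total map task capacity of the Hadoop cluster
        int totalTasks = 1353;     // map tasks in this job
        int esNodes = 4;           // Elasticsearch nodes
        int entriesPerBulk = 1000; // assumed es.batch.size.entries

        System.out.println("max docs in flight: " + maxDocsInFlight(slots, entriesPerBulk));
        System.out.println("waves of map tasks: " + waves(totalTasks, slots));
        System.out.println("avg concurrent writers per ES node: " + (double) slots / esNodes);
    }
}
```

This only bounds the concurrency; it says nothing about document size or indexing cost, which is what actually determines whether 4 nodes can keep up.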
On Wed, Oct 1, 2014 at 7:19 AM, Costin Leau <costin.leau@gmail.com> wrote:
> The error indicates the ES nodes don't reply in a timely fashion and thus
> the connection drops. Based on your logs it seems to be either a GC or a
> network issue.
> You could try turning on logging in package 'org.elasticsearch.hadoop.rest'
> to DEBUG.
> How many tasks do you have and what's your bulk size/entries number?
>
>
> On 10/1/14 2:15 PM, Zach Cox wrote:
>>
>> Hi Costin - we updated our dependencies to use elasticsearch-hadoop
>> 2.0.2.BUILD-SNAPSHOT, but that didn't seem to change anything. We're still
>> seeing the same task failures while trying to write to Elasticsearch. The
>> only difference in the logs is that now I don't see the
>> SimpleHttpConnectionManager warnings.
>>
>> Any ideas what we could try next?
>>
>> Thanks,
>> Zach
>>
>>
>> On Tuesday, September 30, 2014 10:54:27 AM UTC-5, Costin Leau wrote:
>>
>> Can you please try the 2.0.2.BUILD-SNAPSHOT? I think you might be
>> running into issue #256 which was fixed some time ago and will be part
>> of the upcoming 2.0.2 and 2.1 Beta2.
>>
>> Cheers,
>>
>> On 9/30/14 6:43 PM, Zach Cox wrote:
>> > Hi Costin:
>> >
>> > elasticsearch-hadoop 2.0.0
>> > cascading 2.5.4
>> > scalding 0.10.0
>> >
>> > Thanks,
>> > Zach
>> >
>> >
>> > On Tuesday, September 30, 2014 10:25:10 AM UTC-5, Costin Leau wrote:
>> >
>> > What version of es-hadoop/es/cascading are you using?
>> >
>> > On 9/30/14 6:16 PM, Zach Cox wrote:
>> > > Hi - we're having problems with one of our map-reduce jobs that writes
>> > > to Elasticsearch. Lots of map tasks are failing due to ES being
>> > > "unavailable", with logs like this:
>> > >
>> > > https://gist.githubusercontent.com/zcox/3d6cf4329d49ca03271b/raw/57c46a5e4c9ea04d5c4209414d6f847492d16c0d/gistfile1.txt
>> > >
>> > > Seems like elasticsearch-hadoop tries talking to an ES node, it times
>> > > out, tries the next one, it times out, etc. until all nodes in the
>> > > cluster are exhausted and then it gives up.
>> > >
>> > > As far as I can tell, the ES cluster is healthy while this is
>> > > occurring. Many map tasks are succeeding - probably about 10% of the
>> > > attempts are killed due to this issue. The main problem is that these
>> > > killed tasks waste a lot of time, and slow down the overall job
>> > > execution.
>> > >
>> > > I'm not sure where to troubleshoot this next. Does anyone have any
>> > > idea what would cause all of these timeouts & failures?
>> > >
>> > > I'm also curious about the lines like this:
>> > >
>> > > 2014-09-30 12:49:20,469 WARN org.apache.commons.httpclient.SimpleHttpConnectionManager: SimpleHttpConnectionManager being used incorrectly. Be sure that HttpMethod.releaseConnection() is always called and that only one thread and/or method is using this connection manager at a time.
>> > >
>> > > Would that be related to the timeout problem we're seeing?
>> > >
>> > > Thanks,
>> > > Zach
>> > >
>> > > --
>> > > You received this message because you are subscribed to the Google Groups "elasticsearch" group.
>> > > To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
>> > > To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f304a286-399f-4dea-b7f0-032b19ad67e6%40googlegroups.com.
>> > > For more options, visit https://groups.google.com/d/optout.
>> > --
>> > Costin
>> >
>>
>> --
>> Costin
>>
>
>
> --
> Costin
>
--
Costin