Mapper turns out to be very slow


(machong) #1

I'm using the elasticsearch-hadoop library to do some data processing. The Hadoop application seems to get stuck when the mapper progress reaches 80%. I added some debug messages to the mapper and found that the program is not blocked and has no errors; it is just running very slowly (too slow to bear).

I found the message below in the Elasticsearch GC log; I guess it is correlated with this problem.

{Heap before GC invocations=0 (full 1):
par new generation total 18874368K, used 16777216K [0x0000000180000000, 0x0000000680000000, 0x0000000680000000)
eden space 16777216K, 100% used [0x0000000180000000, 0x0000000580000000, 0x0000000580000000)
from space 2097152K, 0% used [0x0000000580000000, 0x0000000580000000, 0x0000000600000000)
to space 2097152K, 0% used [0x0000000600000000, 0x0000000600000000, 0x0000000680000000)
concurrent mark-sweep generation total 5242880K, used 0K [0x0000000680000000, 0x00000007c0000000, 0x00000007c0000000)
Metaspace used 59486K, capacity 64396K, committed 64456K, reserved 1103872K
class space used 8006K, capacity 9715K, committed 9768K, reserved 1048576K
2018-05-15T15:18:43.846+0800: 14.899: [GC (Allocation Failure) 2018-05-15T15:18:43.846+0800: 14.899: [ParNew
Desired survivor size 1073741824 bytes, new threshold 15 (max 15)

age 1: 261086432 bytes, 261086432 total
: 16777216K->255372K(18874368K), 0.1813307 secs] 16777216K->255372K(24117248K), 0.1814428 secs] [Times: user=1.30 sys=0.26, real=0.18 secs]

Can anyone give me some suggestions?


(machong) #2

By the way, I've tried enlarging the Elasticsearch JVM heap to 25 GB, but it didn't help.
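For reference, on recent Elasticsearch versions the heap size is normally set in config/jvm.options; a sketch of the 25 GB setting mentioned above (a config fragment, not a recommendation):

```
# config/jvm.options -- set min and max heap to the same value
-Xms25g
-Xmx25g
```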


(James Baiera) #3

Can you perform a heap dump? Perhaps that can highlight what is taking up all the space in eden.


(machong) #4

Heap stats below (this is jstat -gcutil output rather than a full heap dump). The occupancy of the eden space is not very high, but at this point the mapper runs very slowly once the progress reaches 80%.

S0 S1 E O M CCS YGC YGCT FGC FGCT GCT
0.00 2.41 10.01 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 10.88 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 11.27 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 12.11 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 13.27 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 14.92 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 16.30 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 18.09 44.25 91.94 82.60 47 5.440 2 0.619 6.059
0.00 2.41 19.75 44.25 91.94 82.60 47 5.440 2 0.619 6.059
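The columns above match `jstat -gcutil` output; a sketch of how such samples are collected, where `<es_pid>` stands for the Elasticsearch process id (a placeholder, not from the thread):

```shell
# Sample GC utilization of the ES JVM once per second
# S0/S1 = survivor spaces, E = eden, O = old gen, M = metaspace,
# YGC/FGC = young/full GC counts, *GCT = accumulated GC time
jstat -gcutil <es_pid> 1000
```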

Hadoop app console out:

2018-05-17 11:41:44,767 INFO mapreduce.Job: Running job: job_1526364289182_0011
2018-05-17 11:41:51,941 INFO mapreduce.Job: Job job_1526364289182_0011 running in uber mode : false
2018-05-17 11:41:51,943 INFO mapreduce.Job: map 0% reduce 0%
2018-05-17 11:42:04,125 INFO mapreduce.Job: map 4% reduce 0%
2018-05-17 11:42:05,147 INFO mapreduce.Job: map 10% reduce 0%
2018-05-17 11:42:06,157 INFO mapreduce.Job: map 14% reduce 0%
2018-05-17 11:42:07,166 INFO mapreduce.Job: map 20% reduce 0%
2018-05-17 11:42:08,173 INFO mapreduce.Job: map 34% reduce 0%
2018-05-17 11:42:09,179 INFO mapreduce.Job: map 56% reduce 0%
2018-05-17 11:42:10,187 INFO mapreduce.Job: map 80% reduce 0%
2018-05-17 11:42:24,265 INFO mapreduce.Job: map 80% reduce 27%


(machong) #5

Adding a jstack dump for reference:
"elasticsearch[node-1][search][T#3]" #144 daemon prio=5 os_prio=0 tid=0x00007f0e6c015ef0 nid=0x33ea4 runnable [0x00007f0e5811f000]
java.lang.Thread.State: RUNNABLE
at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.postings(SegmentTermsEnum.java:998)
at org.elasticsearch.search.slice.TermsSliceQuery.build(TermsSliceQuery.java:87)
at org.elasticsearch.search.slice.TermsSliceQuery.access$000(TermsSliceQuery.java:49)
at org.elasticsearch.search.slice.TermsSliceQuery$1.scorer(TermsSliceQuery.java:62)
at org.apache.lucene.search.Weight.scorerSupplier(Weight.java:113)
at org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:329)
at org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:329)
at org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:295)
at org.apache.lucene.search.Weight.bulkScorer(Weight.java:147)
at org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:289)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:657)
at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:191)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:462)
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:265)
at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:107)
at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:466)
at org.elasticsearch.action.search.SearchTransportService$9.messageReceived(SearchTransportService.java:424)
at org.elasticsearch.action.search.SearchTransportService$9.messageReceived(SearchTransportService.java:421)
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:258)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:316)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66)
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:656)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:635)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


(machong) #6

James,
I have a new discovery: the number of splits may have an impact on mapper performance.
When I increased es.input.max.docs.per.partition to several times its previous value to decrease the number of splits, my application sped up noticeably. However, it still takes too long for me. Is there any other way to make it faster?


(James Baiera) #7

We've seen that in some cases the slice feature can cause slowdowns. If you set es.input.max.docs.per.partition to a really high number, like max int, how does it do? Which versions of the connector and of ES are you running?
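A sketch of that suggestion as job configuration (the property name is from the thread; the max-int value and the way it is passed are illustrative):

```
# Set via -D on the hadoop command line, or Configuration.set(...) in the driver.
# 2147483647 is Integer.MAX_VALUE, effectively disabling partition splitting.
es.input.max.docs.per.partition=2147483647
```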


(system) #8

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.