Elasticsearch spark very high CPU

israel · July 6, 2015, 12:26pm

Hi,

I have a problem with a Spark job I created that consumes data from Kafka, writes it to Cassandra and then indexes it into Elasticsearch. Even when no data is streaming through Kafka, the CPU load is sky high.
When I remove the 'save to ES' section, CPU usage is normal.
I have created a demo app and attached a link to it here.

Thanks,

Israel

costin · July 6, 2015, 1:15pm

Can you post a CPU dump, or connect to the JVM using
jstack to see which threads are spinning the CPU? The demo is great;
however, as this is runtime behavior, it's kinda hard to reproduce
outside your environment.
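
As a complement to jstack, per-thread CPU time can also be sampled from inside the JVM through the standard `java.lang.management.ThreadMXBean`. The following is only a minimal sketch: the class name `BusyThreads`, its `sample` method and the window/limit parameters are hypothetical and not part of the demo app, and it assumes the JVM supports thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not from the thread): ranks live threads by CPU used in a time window. */
final class BusyThreads {

    /** Returns lines of the form "threadName N ms", busiest first, at most {@code limit} entries. */
    static List<String> sample(long windowMillis, int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> out = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return out; // assumption: nothing to report on JVMs without CPU time support
        }

        // First snapshot: CPU time (nanoseconds) per thread id.
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id));
        }

        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag and report what we have
        }

        // Second snapshot: keep threads seen in both snapshots with valid readings.
        List<long[]> deltas = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            Long start = before.get(id);
            long end = mx.getThreadCpuTime(id); // -1 if the thread died or measurement is disabled
            if (start == null || start < 0 || end < 0) {
                continue;
            }
            deltas.add(new long[] {id, end - start});
        }
        deltas.sort((a, b) -> Long.compare(b[1], a[1]));

        for (long[] d : deltas) {
            if (out.size() >= limit) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(d[0]);
            if (info != null) {
                out.add(info.getThreadName() + " " + (d[1] / 1_000_000) + " ms");
            }
        }
        return out;
    }
}
```

Calling `BusyThreads.sample(1000, 10)` from within the job's driver would list the ten threads that used the most CPU during one second; the names can then be matched against a jstack dump.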

Thanks,

israel · July 6, 2015, 1:50pm

It happens on 3 different machines (OS X, Ubuntu and Windows 7)

israel · July 12, 2015, 8:51am

Hi Costin,

Attached are 3 jstack snapshots taken while the demo process is running and the CPU is very high.

https://gist.github.com/israel/30cde540623f13484904#file-gistfile1-txt

gistfile1.txt

2015-07-12 11:39:27
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode):

"Attach Listener" #1755 daemon prio=9 os_prio=31 tid=0x00007fcc20632800 nid=0x2b46b waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"MemtablePostFlush:2" #1754 daemon prio=5 os_prio=31 tid=0x00007fcc14519800 nid=0x34f2f waiting on condition [0x0000000132d3e000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000007401bdf38> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

This file has been truncated.

jstack2

2015-07-12 11:39:58
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode):

"Shutdown-checker" #1824 prio=5 os_prio=31 tid=0x00007fcc1d0a7000 nid=0x39133 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"cluster66-worker-0" #1821 prio=5 os_prio=31 tid=0x00007fcc2075d000 nid=0x3054b runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Shutdown-checker" #1820 prio=5 os_prio=31 tid=0x00007fcc1d67b800 nid=0x36e2b runnable [0x0000000000000000]

This file has been truncated.

jstack3

2015-07-12 11:40:08
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.45-b02 mixed mode):

"globalEventExecutor-1-42" #1826 prio=5 os_prio=31 tid=0x00007fcc1fbab800 nid=0x3662b runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Attach Listener" #1755 daemon prio=9 os_prio=31 tid=0x00007fcc20632800 nid=0x2b46b waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"MemtablePostFlush:2" #1754 daemon prio=5 os_prio=31 tid=0x00007fcc14519800 nid=0x34f2f waiting on condition [0x0000000132d3e000]

This file has been truncated.

Thanks,

Israel

costin · July 14, 2015, 6:19am

I took a quick look at the stack traces, but the vast majority of threads belong to Cassandra or the Cassandra connector. That's not to say that Cassandra is to blame here or that the Elasticsearch connector has no impact; rather, it's hard to understand what is causing the JVM to eat all the CPU, since all these apps look like they are running within the same VM.
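
To see at a glance which components own the threads in a dump, the thread header lines can be grouped by name. The sketch below is an illustration only: `ThreadDumpSummary` and its methods are hypothetical names, and the grouping rule (dropping trailing digits and separators such as `:2` or `-1-42`) is an assumption about how pool threads are numbered, not something jstack defines.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not from the thread): counts threads in a jstack dump per name group. */
final class ThreadDumpSummary {

    /** Strips a trailing run of digits and separators, e.g. "MemtablePostFlush:2" -> "MemtablePostFlush". */
    static String groupOf(String threadName) {
        String stripped = threadName.replaceAll("[-:#_ \\d]+$", "");
        return stripped.isEmpty() ? threadName : stripped;
    }

    /** Counts thread header lines (the ones starting with a quoted name) per group, sorted by group name. */
    static Map<String, Integer> countByGroup(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dump.split("\\R")) {
            if (!line.startsWith("\"")) {
                continue; // stack frames and state lines are ignored
            }
            int end = line.indexOf('"', 1);
            if (end < 0) {
                continue; // malformed header, skip it
            }
            counts.merge(groupOf(line.substring(1, end)), 1, Integer::sum);
        }
        return counts;
    }
}
```

Feeding the full text of a dump to `ThreadDumpSummary.countByGroup` gives one count per thread pool, which makes it easier to tell how many threads each library contributes before looking at individual stacks.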

Can you try running each app in a separate VM? This will help isolate the CPU-hungry process. Furthermore, while writing data to Elasticsearch, can you run the hot threads API to see how Elasticsearch behaves?
You can also try using Marvel or other monitoring plugins to understand the impact of indexing on Elasticsearch.
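
For reference, the hot threads API is a plain HTTP endpoint (`GET /_nodes/hot_threads`) that returns a text report. A minimal sketch of calling it from Java follows; the class `HotThreads`, the base URL `http://localhost:9200` and the chosen `threads`/`interval` values are assumptions for illustration and should be adapted to the actual cluster.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

/** Hypothetical helper (not from the thread): fetches the Elasticsearch hot threads report. */
final class HotThreads {

    /** Builds the request URL; baseUrl is assumed to look like "http://localhost:9200". */
    static String url(String baseUrl, int threads, String interval) {
        String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        return base + "/_nodes/hot_threads?threads=" + threads + "&interval=" + interval;
    }

    /** Performs the GET request and returns the plain-text report. */
    static String fetch(String baseUrl, int threads, String interval) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(url(baseUrl, threads, interval)).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(30_000);
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            // "\\A" makes the scanner return the whole response body as one token.
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a node is reachable on the default HTTP port of the local machine.
        System.out.println(fetch("http://localhost:9200", 3, "500ms"));
    }
}
```

Running it once while the job is indexing and once while the RDDs are empty would show whether the Elasticsearch node itself is busy in either case.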

Where does the data in Elasticsearch come from? Can you minimize your example, for instance by eliminating Cassandra and indexing only to Elasticsearch? Furthermore, can you first index the data from HDFS or the file system, and then add Kafka to see whether it makes a difference?

There are a lot of moving parts, and it's unclear whether a certain component is eating the CPU or whether their interaction is causing the issue...

israel · July 14, 2015, 6:45am

Hi Costin,

Thank you for your help.

As for the small demo I provided: it writes a small amount of data to a Kafka topic, then consumes it, writes it to Cassandra and indexes it into Elasticsearch.

If I remove the part of indexing to Elasticsearch, CPU is normal!

Also, what bothers me more is that after the data has been consumed (after a few seconds), the CPU stays very high although no data is being processed at all!!
I can run the hot threads API, but is it relevant when there is no data at all? (the RDDs are empty)
I will try running the indexing part in a separate JVM.

Thanks,

Israel

costin · July 14, 2015, 8:39am

Try first with a version that simply reads from Kafka and writes to Elasticsearch, without Cassandra in between; use as few parts as possible.

israel · July 14, 2015, 11:11am

will do. thanks

israel · July 14, 2015, 11:47am

Once I moved Cassandra out of this JVM, everything went back to normal. I still don't know why... anyway, thanks for the help

israel · July 14, 2015, 11:49am

I mean the Cassandra server itself. I am still using the Cassandra connector to write to Cassandra from the same JVM

costin · July 14, 2015, 8:30pm

Interesting. It looks like there might be a tripping point: potentially the network layer/Netty might cause the issue if multiple instances are running within the same JVM.
This is just a hunch. Either way, even without this issue, I would strongly recommend running each server / long-running application in its own JVM / space, simply for better control and performance.
