Low performance of Spark Streaming with Elasticsearch

Johnnie · February 2, 2017, 1:27am

Hi,

Our Spark Streaming use case is to read streaming data from Kafka and join it with an index in Elasticsearch.
Another Spark Streaming job updates the Elasticsearch index at a fixed interval, so the index data is not static.

Platform details are as follows:
Spark 1.6.2 standalone cluster with 15 nodes, 90 cores, and 500 GB of memory.
Elasticsearch 2.4 with 10 data nodes, holding about 1 billion documents (about 250 GB) in 20 shards.

Our interval is 1 hour, and reading the whole index from Elasticsearch takes the job 50 minutes.
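
For a rough sense of scale, here is a back-of-the-envelope calculation from the figures above. It is only a sketch: it assumes "250G" means 250 GiB, that the 50 minutes is spent entirely on the read, and that the connector reads with one Spark task per shard (an assumption about our setup that I have not verified). The function name `read_stats` is made up for illustration.

```python
# Figures taken from the post; the unit interpretation (G = GiB) is an assumption.
INDEX_BYTES = 250 * 1024**3      # about 250G of index data
DOC_COUNT = 10**9                # about 1 billion documents
READ_SECONDS = 50 * 60           # full index read takes 50 minutes
SHARDS = 20
SPARK_CORES = 90


def read_stats(index_bytes, doc_count, read_seconds, shards, cores):
    """Return aggregate and per-shard read rates for a full index scan."""
    total_mib_per_s = index_bytes / read_seconds / 1024**2
    docs_per_s = doc_count / read_seconds
    return {
        "total_mib_per_s": total_mib_per_s,
        "docs_per_s": docs_per_s,
        "mib_per_s_per_shard": total_mib_per_s / shards,
        "docs_per_s_per_shard": docs_per_s / shards,
        # Assumption: one read task per shard, so at most `shards` cores are busy.
        "max_busy_core_fraction": min(shards, cores) / cores,
    }


stats = read_stats(INDEX_BYTES, DOC_COUNT, READ_SECONDS, SHARDS, SPARK_CORES)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

If the one-task-per-shard assumption holds, at most 20 of the 90 cores would be reading at any time, and each shard would be streaming only about 4 MiB/s.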

Is there any way to improve the performance of the job?

Thanks

system · March 2, 2017, 1:28am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.