I want to read a big index (around 10 GB) from Elasticsearch with Spark. Is there any advice?
This is a pretty open question, but a few things will help. First, make sure your index has enough shards: the elasticsearch-hadoop connector creates one Spark partition per shard, so shard count sets an upper bound on read parallelism. Second, filter out any data you don't need before it reaches Spark by using the pushdown query properties, so only matching documents leave the cluster. If you are using Spark SQL, the connector can push SQL filters and projections down to Elasticsearch automatically via the optimizer.
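A minimal sketch of the Spark SQL approach, assuming a local cluster and a hypothetical index name and field names (`my-big-index`, `status`, `id`, `timestamp` are placeholders you'd replace with your own):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("es-read")
  .config("es.nodes", "localhost")  // hypothetical cluster address
  .config("es.port", "9200")
  .getOrCreate()

import spark.implicits._

// Load the index as a DataFrame through the elasticsearch-hadoop connector.
// "pushdown" (on by default) lets the Catalyst optimizer translate DataFrame
// filters and projections into the Elasticsearch query.
val df = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("pushdown", "true")
  .load("my-big-index")

// Both the filter and the column selection are pushed to Elasticsearch,
// so only matching documents and fields are shipped to Spark.
val filtered = df
  .filter($"status" === "active")   // hypothetical field
  .select("id", "timestamp")
```

This is just a sketch to show where the pushdown happens; it requires a running Elasticsearch cluster and the elasticsearch-hadoop (elasticsearch-spark) dependency on the classpath.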