Elasticsearch-Hadoop: how to do a bulk search in a Spark program

tosaurabh85 · September 7, 2017, 2:21am

I am writing a Spark program that works on an RDD of strings. For each string I need to build a query and run it against an Elasticsearch index, so the query differs per string. I would like to use elasticsearch-hadoop for the search so that I can benefit from its optimizations. The RDD can be large, and I am looking for any optimizations possible.

For example, the RDD is List[India, IBM Company, Netflix, Lebron James]. We would create a More Like This query for each of these terms, run it against the Wikipedia index, and get back the results. In this case that means four More Like This queries, one each for India, IBM Company, Netflix, and Lebron James, and we get back the hits for each.

I do have a workaround: I can call the HTTP REST API with a bulk search to get back the hits, but then I would be doing the optimizations on my own. I wanted to see whether the Spark Elasticsearch connector can create the queries and run the search in an optimized way.
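For reference, the HTTP workaround described above could look roughly like the sketch below. It assumes that "bulk search" means the Multi Search (`_msearch`) endpoint, that the index is named `wikipedia`, and that the text lives in a field called `text`; the index name, field name, and batch size are illustrative guesses, and `send` is a hypothetical callable that POSTs an NDJSON body to `/_msearch` and returns the parsed JSON reply.

```python
import json
from itertools import islice


def mlt_query(text, fields, size=10):
    """Build one more_like_this search body for a single input string."""
    return {
        "size": size,
        "query": {
            "more_like_this": {
                "fields": list(fields),  # assumed field names
                "like": text,
                # Short inputs such as "India" contain each term only once,
                # so the default min_term_freq of 2 would discard every term.
                "min_term_freq": 1,
            }
        },
    }


def batches(items, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    it = iter(items)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def msearch_body(texts, index, fields):
    """Serialize one _msearch body: a header line, then a query line, per text."""
    lines = []
    for text in texts:
        lines.append(json.dumps({"index": index}))
        lines.append(json.dumps(mlt_query(text, fields)))
    return "\n".join(lines) + "\n"  # the NDJSON body must end with a newline


def search_partition(texts, send, index="wikipedia", fields=("text",),
                     batch_size=100):
    """Issue one _msearch call per batch and yield (text, hits) pairs.

    `send` is a hypothetical callable: it takes the NDJSON body and returns
    the parsed JSON reply from Elasticsearch.
    """
    for chunk in batches(texts, batch_size):
        reply = send(msearch_body(chunk, index, fields))
        # _msearch returns its responses in the same order as the requests.
        for text, response in zip(chunk, reply["responses"]):
            # A failed sub-search carries "error" instead of "hits".
            yield text, response.get("hits", {}).get("hits", [])
```

In a Spark job this would probably be wired in with something like `rdd.mapPartitions(lambda part: search_partition(part, send))`, so that each partition issues a handful of batched requests instead of one request per string. Batch size and the number of partitions are then the main knobs for how hard the cluster gets hit.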

james.baiera · September 12, 2017, 3:15pm

This isn't really what ES-Hadoop is meant to be used for. ES-Hadoop/Spark is primarily a connector for ingesting into and reading from Elasticsearch over the bulk and scroll APIs.

