Elasticsearch-Hadoop: how to do a bulk search in a Spark program


(Saurabh Sharma) #1

I am writing a Spark program that operates on an RDD of strings. For each string I need to build a query and run it against an Elasticsearch index, so the query differs per string. I would like to use elasticsearch-hadoop for the search so that I can benefit from its optimizations. The RDD can be large, so I am looking for any optimizations possible.

For example, say the RDD is List[India, IBM Company, Netflix, Lebron James]. We would create a More Like This query for each of these terms, run them against the Wikipedia index, and get back the hits. That is four More Like This queries, one each for India, IBM Company, Netflix, and Lebron James.

I do have a workaround: I can call the HTTP REST API with a bulk search to get back the hits, but then I would be doing the optimizations on my own. I wanted to see whether the Spark Elasticsearch connector can create the queries and run the search in an optimized way.
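
To make the workaround concrete, here is a minimal sketch of how the per-string queries could be grouped into multi-search (`_msearch`) request bodies. It only builds the newline-delimited JSON and sends nothing. The index name `wikipedia`, the field `text`, the batch size, and the helper names `batched` and `msearch_body` are assumptions for illustration, not anything from my actual setup.

```python
import json
from itertools import islice


def batched(terms, size):
    """Yield successive lists of at most `size` terms."""
    it = iter(terms)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def msearch_body(terms, index="wikipedia", fields=("text",), hits=5):
    """Build one _msearch body: a header line and a query line per term.

    `index` and `fields` are hypothetical; adjust them to the real mapping.
    """
    lines = []
    for term in terms:
        # Header line: which index this search targets.
        lines.append(json.dumps({"index": index}))
        # Body line: a more_like_this query built from the term text.
        lines.append(json.dumps({
            "size": hits,
            "query": {
                "more_like_this": {
                    "fields": list(fields),
                    "like": term,
                    "min_term_freq": 1,
                    "min_doc_freq": 1,
                }
            },
        }))
    # The multi-search API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


terms = ["India", "IBM Company", "Netflix", "Lebron James"]
bodies = [msearch_body(chunk) for chunk in batched(terms, 2)]
```

In a Spark job, something like this could presumably be called from inside `mapPartitions`, so that each partition posts its batches to `/_msearch` with the `application/x-ndjson` content type, but the batching and retry tuning would still be on me.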


(James Baiera) #2

This isn't really what ES-Hadoop is meant to be used for. ES-Hadoop/Spark is primarily a connector for ingesting data into and reading data from Elasticsearch over the bulk and scroll APIs.
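
As a rough illustration of that read path, a connector read is configured with a single query for the whole RDD through settings such as `es.resource` and `es.query`, rather than one query per input record. The sketch below only assembles such a configuration. The node address, the index name `wikipedia`, the field `text`, and the helper name `es_read_conf` are assumptions for the example.

```python
import json


def es_read_conf(query, nodes="localhost:9200", resource="wikipedia"):
    """Assemble ES-Hadoop read settings for one scroll-based read.

    `nodes` and `resource` are placeholder values. Note that there is
    a single `es.query` entry, so one read carries one query.
    """
    return {
        "es.nodes": nodes,
        "es.resource": resource,
        "es.query": json.dumps(query),
    }


# One query for the entire read, not one per string in an RDD.
conf = es_read_conf({"query": {"match": {"text": "India"}}})
```

A dictionary like this would typically be handed to the connector's input format when creating the RDD, which is why a different query per string does not map onto the connector naturally.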

