I am a beginner with Elasticsearch. I am able to do bulk indexing using the Java API.
ES is a really good tool for searching, but I am wondering why people talk about MapReduce together with ES.
Let's take the example I am trying:
I have a database column that contains 20,000 records, some of which are duplicates.
I have indexed those records in ES, and I want to get the matching records back from Elasticsearch.
Basic doubt: after indexing in ES, do I have to put the file in HDFS to run a MapReduce program? If I have to put the file in HDFS and process it there, then what is the purpose of indexing here?
I hope my question is clear enough for others to understand.