What does ES-Hadoop do?


(Kramer Li) #1

In what situations do you need ES-Hadoop? ES can be used as a search engine, an analytics engine, and even as storage. Why would we need ES-Hadoop? Could you give a specific example, please?


(Mark Walkom) #2

Do you have Hadoop?
If not, then you probably don't need it.


(Kramer Li) #3

Hi Mark

Are you from Cisco? I seem to have found your name on a Cisco blog. That blog talks about using ES to handle NetFlow data.

:slightly_smiling:

Regards
Mingwei


(Costin Leau) #4

To quote the docs:

Elasticsearch for Apache Hadoop is an open-source,
stand-alone, self-contained, small library that allows Hadoop jobs
(whether using Map/Reduce or libraries built upon it such as Hive, Pig
or Cascading or new upcoming libraries like Apache Spark) to interact with Elasticsearch. One can think of it as a connector that allows data to flow bi-directionally
so that applications can leverage transparently the Elasticsearch
engine capabilities to significantly enrich their capabilities and
increase the performance.
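
As a rough illustration of that two-way flow, here is a minimal PySpark sketch. It assumes the elasticsearch-hadoop (elasticsearch-spark) jar is on the Spark classpath; the helper names, the host, and the `logs` index are hypothetical and only there for the example. The `es.*` keys are ES-Hadoop configuration settings.

```python
# Sketch only: helper names, host, and index are hypothetical placeholders.

def es_hadoop_options(resource, nodes="localhost", port=9200, query=None):
    """Build the es.* option map that ES-Hadoop reads its settings from."""
    opts = {
        "es.nodes": nodes,        # Elasticsearch host(s) to connect to
        "es.port": str(port),     # REST port, passed as a string
        "es.resource": resource,  # index to read from or write to
    }
    if query is not None:
        opts["es.query"] = query  # optional query to restrict what is read
    return opts


def read_from_es(spark, resource, **kwargs):
    """Elasticsearch -> Spark: load an index as a DataFrame."""
    return (spark.read
            .format("org.elasticsearch.spark.sql")
            .options(**es_hadoop_options(resource, **kwargs))
            .load())


def write_to_es(df, resource, **kwargs):
    """Spark -> Elasticsearch: index the rows of a DataFrame."""
    (df.write
       .format("org.elasticsearch.spark.sql")
       .options(**es_hadoop_options(resource, **kwargs))
       .mode("append")
       .save())
```

With a running cluster and an existing `SparkSession` named `spark`, something like `read_from_es(spark, "logs", query="?q=error")` would pull matching documents into a job, and `write_to_es(df, "logs-summary")` would push results back.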


(Kramer Li) #5

So it is like a tunnel between Elasticsearch and Hadoop. Data can flow in both directions, right?


(Mark Walkom) #6

You did? Where was that?


(vinayak shukre) #7

ES-Hadoop is a connector between Elasticsearch and Hadoop. Does that mean I should not store any data in HDFS and should store everything in Elasticsearch instead?

Without this connector, it was not possible to run MR jobs on data in Elasticsearch; ES-Hadoop allows that. So is it OK to give HDFS little or no disk space?

