In what situations do you need ES-Hadoop? ES can be used as a search engine, an analytics engine, and even as storage, so why do we need ES-Hadoop? Could you give some very specific examples, please?
Do you have Hadoop?
If not then you probably don't need it.
Are you from Cisco? I think I found your name on a Cisco blog. That blog post talks about using ES to handle NetFlow data.
To quote the docs:
Elasticsearch for Apache Hadoop is an open-source,
stand-alone, self-contained, small library that allows Hadoop jobs
(whether using Map/Reduce or libraries built upon it such as Hive, Pig
or Cascading or new upcoming libraries like Apache Spark) to interact with Elasticsearch. One can think of it as a connector that allows data to flow bi-directionally
so that applications can leverage transparently the Elasticsearch
engine capabilities to significantly enrich their capabilities and
increase the performance.
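To make the "connector" idea a bit more concrete, here is a rough sketch of what both directions could look like from PySpark. Treat it as an illustration, not the canonical setup: the index name `netflow`, the host `localhost`, and the helper names `es_hadoop_conf`, `read_from_es`, and `write_to_es` are all made up for this example. The `es.*` keys and the `EsInputFormat` / `EsOutputFormat` class names are the ones ES-Hadoop documents, and the two Spark calls assume the elasticsearch-hadoop jar is on the Spark classpath.

```python
def es_hadoop_conf(resource, nodes="localhost", port=9200, query=None):
    """Build the ES-Hadoop settings dict passed to a Hadoop/Spark job.

    `resource` is the Elasticsearch index to read from or write to
    (hypothetical here). Hadoop configuration values must be strings.
    """
    conf = {
        "es.nodes": nodes,        # Elasticsearch host(s) to connect to
        "es.port": str(port),     # REST port, as a string
        "es.resource": resource,  # target index
    }
    if query is not None:
        conf["es.query"] = query  # optional: restrict which documents are read
    return conf


def read_from_es(sc, conf):
    """Elasticsearch -> Hadoop/Spark: expose an index as an RDD.

    `sc` is an existing pyspark SparkContext; needs the
    elasticsearch-hadoop jar on the classpath.
    """
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=conf,
    )


def write_to_es(rdd, conf):
    """Hadoop/Spark -> Elasticsearch: index an RDD of (key, dict) pairs."""
    rdd.saveAsNewAPIHadoopFile(
        path="-",  # unused by EsOutputFormat, but the API requires a value
        outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=conf,
    )
```

With something like this, a job could call `read_from_es(sc, es_hadoop_conf("netflow", query="?q=*"))` to pull documents into Spark, process them alongside data from HDFS, and push results back with `write_to_es`.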
So it is like a tunnel between Elasticsearch and Hadoop, and data can flow in both directions, right?
You did, where was that?
ES-Hadoop is a connector between Elasticsearch and Hadoop. Does that mean I should not store any data in HDFS and should store everything in Elasticsearch instead?
Without this connector, it was not possible to run MR jobs on data in Elasticsearch. ES-Hadoop allows that. So is it OK to give HDFS little or no disk space?