How does es-hadoop get data?

t7s · November 25, 2015, 8:02am

I know that es-hadoop finds the relevant shards to read data from, but
when reading from those shards, does es-hadoop fetch all the data and leave it to Hadoop/Spark to filter out the wanted records, or does es-hadoop send the query to the shards so that only the wanted data comes back?

For example, when I use es-hadoop to find records that satisfy gender=male, are those records returned by the shards directly, or does es-hadoop fetch all the data from the target shards and then obtain the result by iterating over the whole data set?

costin · November 25, 2015, 9:26pm

This is explained in the reference doc, in particular in the architecture chapter.
In short, es-hadoop will get only the data needed from each shard; it would be highly inefficient and frankly pointless to get all the data and filter things in memory (why would it do that when ES can do all this itself?).
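
To make that concrete, here is a minimal sketch (not taken from the reference doc) of how the gender=male filter from the question could be handed to Elasticsearch through the `es.query` setting, so that each shard returns only matching documents. The index/type name `people/person`, the node address `localhost:9200`, and the helper names `build_es_read_conf` and `read_filtered_rdd` are assumptions for illustration; the PySpark call is the commonly used `newAPIHadoopRDD` route and needs the es-hadoop jar on the classpath.

```python
import json


def build_es_read_conf(resource, field, value, nodes="localhost:9200"):
    """Build an es-hadoop read configuration (hypothetical helper).

    The filter travels in "es.query", so Elasticsearch evaluates it on each
    shard and only the matching documents are streamed back to the job.
    """
    query = {"query": {"term": {field: value}}}
    return {
        "es.nodes": nodes,        # assumed cluster address
        "es.resource": resource,  # index/type to read from (assumed name)
        "es.query": json.dumps(query),
    }


def read_filtered_rdd(sc, conf):
    """Read the filtered documents as an RDD (hypothetical helper).

    `sc` is an existing SparkContext; this only works when the es-hadoop
    jar is available to Spark and the cluster in conf["es.nodes"] is up.
    """
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=conf,
    )


# Build the configuration for the example in the question.
conf = build_es_read_conf("people/person", "gender", "male")
print(conf["es.query"])
```

Without an `es.query` entry es-hadoop falls back to a match-all query, so any filtering would then have to happen on the Hadoop/Spark side, which is the case the answer above advises against.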
