Elasticsearch Hadoop


(Badal Mohapatra) #1

Hi,

As I understand it, to index Hadoop data into Elasticsearch we create an
external table with the EsStorageHandler and then copy the data into it
from another, internal Hive table. Doesn't this duplicate the data in HDFS?
Is there any way to index directly from the internal Hive tables instead
of having two tables with the same data?

Kind Regards,
Badal

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ed08fd38-05e4-437a-a8e2-3295f2195e2a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Costin Leau) #2

There is no duplication per se in HDFS. Hive tables are just 'views' of data - one sits unindexed, in raw format, in HDFS;
the other one is indexed and analyzed in Elasticsearch.

You can't combine the two since they are completely different things - one is a file system, the other one is a search
and analytics engine.
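
To make the two 'views' concrete, here is a minimal sketch of the two HiveQL statements involved, built as strings in Python so they can be inspected. The table names (artists, artists_es), the columns, and the index name radio/artists are hypothetical placeholders, not taken from this thread; the storage handler class and the es.resource property are the ones es-hadoop uses for Hive. Under these assumptions, the external table keeps no files in HDFS: rows inserted into it are sent to Elasticsearch, while the internal table remains the only copy in HDFS.

```python
# Sketch only: builds the two HiveQL statements discussed in this thread.
# Table, column and index names are hypothetical placeholders.

ES_STORAGE_HANDLER = "org.elasticsearch.hadoop.hive.EsStorageHandler"


def es_table_ddl(table, columns, es_resource):
    """Return DDL for an external Hive table backed by Elasticsearch."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"STORED BY '{ES_STORAGE_HANDLER}'\n"
        f"TBLPROPERTIES('es.resource' = '{es_resource}')"
    )


def copy_statement(es_table, source_table, columns):
    """Return the INSERT that pushes rows from the HDFS-backed table to Elasticsearch."""
    names = ", ".join(name for name, _ in columns)
    return f"INSERT OVERWRITE TABLE {es_table}\nSELECT {names} FROM {source_table}"


columns = [("id", "BIGINT"), ("name", "STRING")]
ddl = es_table_ddl("artists_es", columns, "radio/artists")
copy = copy_statement("artists_es", "artists", columns)

print(ddl + ";")
print(copy + ";")
```

Running the two printed statements in Hive would, under the assumptions above, index the rows of the internal table without writing a second copy of them to HDFS.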

On 09/01/2014 9:49 AM, Badal Mohapatra wrote:

Hi,

As I understand it, to index Hadoop data into Elasticsearch we create an external table with the EsStorageHandler and
then copy the data into it from another, internal Hive table. Doesn't this duplicate the data in HDFS?
Is there any way to index directly from the internal Hive tables instead of having two tables with the same data?

Kind Regards,
Badal

--
Costin


