Read from Elasticsearch into Hive

Hi,

In the documentation, a Hive external table is used to map the data from an Elasticsearch index.

Does the external table in Hive store the data once it is created, say in /user/hive/warehouse?
Or is the data fetched from Elasticsearch for every select query?

I configured Elasticsearch for Hadoop and am using it with Hive.

I created an external table to read ES data into Hive. After I ran a select query, I could see the data in Hive. But the external table's storage directory (in my case /user/hive/warehouse/<table_name>) is empty; I cannot see any data in there. So Hive doesn't store the data? Does it fetch the data from Elasticsearch for each query?

Hi,
the data stays in Elasticsearch. The external table works like a view of the ES index. You can even drop the external table and your index will not be deleted. And yes, every time you query the external table you are querying Elasticsearch again (which is slow). So I usually create an external table and then create new tables in Hive by selecting the data from the external table. These new tables are permanent Hive tables, so their data is stored in Hive itself.
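
For illustration, here is a minimal HiveQL sketch of that pattern. The table names, columns, index name (my_index) and node address (localhost:9200) are made up for the example, so adjust them to your setup. It assumes the elasticsearch-hadoop jar is already available to Hive.

```sql
-- External table: only metadata lives in Hive, the rows stay in Elasticsearch.
-- 'my_index' and 'localhost:9200' are placeholder values for this example.
CREATE EXTERNAL TABLE es_events (
    id      BIGINT,
    message STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'my_index',
               'es.nodes'    = 'localhost:9200');

-- Each query against es_events reads from Elasticsearch again.
SELECT COUNT(*) FROM es_events;

-- Copy the data once into a regular (managed) Hive table.
-- Its files are written under the Hive warehouse directory.
CREATE TABLE events_local STORED AS ORC AS
SELECT id, message FROM es_events;

-- Later queries on events_local no longer touch Elasticsearch.
SELECT COUNT(*) FROM events_local;
```

Keep in mind that events_local is a snapshot: documents indexed into Elasticsearch afterwards will not show up in it until you reload the table.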
