Collect data from Hive

I am trying to index data from Hive to ES. The documentation I have read online basically tells you to first create an external table and then start inserting data into it. For inserting data into this table at a regular interval, do we need to create a shell script, or is there another way?

It was easier getting data from Oracle. Can we do this with a logstash.config file?

Hive is specifically a batch-processing tool, so if you want it to do periodic transfers to Elasticsearch, you will need to either write some sort of script or use a scheduling solution to run your queries regularly, for example a cron-driven script like the sketch below.
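A minimal sketch of what that could look like, assuming the elasticsearch-hadoop jar is available on the machine; the jar path, table names, index name, and host below are all placeholders:

```bash
#!/usr/bin/env bash
# Hypothetical periodic Hive -> Elasticsearch transfer via ES-Hadoop.
# Jar path, table/index names, and the ES host are placeholders.
set -euo pipefail

hive -e "
  ADD JAR /opt/es-hadoop/elasticsearch-hadoop.jar;

  -- External table backed by an Elasticsearch index (created once;
  -- shown here for completeness).
  CREATE EXTERNAL TABLE IF NOT EXISTS sales_es (id BIGINT, amount DOUBLE, ts TIMESTAMP)
  STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
  TBLPROPERTIES('es.resource' = 'sales/doc', 'es.nodes' = 'es-host:9200');

  -- Push the latest batch of rows into Elasticsearch.
  INSERT OVERWRITE TABLE sales_es
  SELECT id, amount, ts FROM sales WHERE ts >= date_sub(current_date, 1);
"

# Schedule it with cron, e.g. hourly:
# 0 * * * * /opt/scripts/hive_to_es.sh >> /var/log/hive_to_es.log 2>&1
```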

Logstash is a great way to move data to Elasticsearch if you are looking for more of a stream-processing approach: as the data becomes available, your Logstash pipeline can push it to Elasticsearch for immediate use, along the lines of the sketch below. Feel free to poke around the Logstash documentation and the Logstash part of these forums; there are plenty of people around who are happy to help with any questions.
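As a rough sketch of that streaming style, here is an inline pipeline that tails a delimited file and ships each row to Elasticsearch; the file path, column names, and index name are all hypothetical:

```bash
# Minimal sketch: run Logstash with an inline pipeline (-e) that watches a
# delimited file and indexes each new row into Elasticsearch.
bin/logstash -e '
  input {
    file {
      path => "/data/exports/sales.csv"
      start_position => "beginning"
    }
  }
  filter {
    csv { columns => ["id", "amount", "ts"] }
  }
  output {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "sales"
    }
  }
'
```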

@james.baiera Thanks for the reply. I am using Logstash to fetch data from Oracle tables and from logs. Now I need to fetch data from Hive tables for visualisation. Since Logstash won't work here, I think there are two approaches:

  1. Use ES-Hadoop to create an external table backed by an Elasticsearch index, then write a script that inserts data into that table.
  2. Write a script that exports data from the Hive table to a CSV file, then FTP it to the ELK server, where Logstash picks the data up.

The second way saves me the pain of installing the additional ES-Hadoop jars; something like the sketch below.
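A minimal sketch of that export script, with a hypothetical query, file path, and destination host (hive -e prints tab-separated rows, so tr converts the tabs to commas):

```bash
#!/usr/bin/env bash
# Hypothetical Hive -> CSV -> ELK-server export. Query, paths, and the
# destination host are placeholders for your own setup.
set -euo pipefail

EXPORT_FILE=/tmp/sales_export.csv

# hive -e prints tab-separated rows; turn the tabs into commas for CSV.
hive -e "SELECT id, amount, ts FROM sales WHERE ts >= date_sub(current_date, 1);" \
  | tr '\t' ',' > "$EXPORT_FILE"

# Ship the file to the ELK server (scp shown; swap in ftp/sftp as needed),
# where a Logstash file input like the one sketched above picks it up.
scp "$EXPORT_FILE" elk-user@elk-server:/data/exports/
```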

I think either way would work. Depending on how much data you have in Hive, the ES-Hadoop route might be a bit faster, but if skipping the extra installations and updates is worth the tradeoff to you, then by all means use the approach that works best for you!

