Insert data into Elasticsearch from Hive in real-time

(Hamdi Charef) #1

I'm asking about inserting data into Elasticsearch from Hive in real time.
My case: I have created the external table and inserted the existing Hive data into Elasticsearch.
Everything works fine. Now, if new data is inserted into Hive, how can I insert that new data into Elasticsearch?
Thanks for any additional information.

(Costin Leau) #2

You will need to figure out the delta, that is, the new data that was inserted in Hive but is not yet in ES.
As your new data comes along, you can insert it into both Hive and ES or, if it's a separate job, compute the delta (for example, by querying on a field) and then insert the results into ES.
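
A rough sketch of the separate-job approach in HiveQL. All names here are hypothetical and not from this thread: `source_artists` is the native Hive table that receives new rows, `artists_es` is the ES-backed external table, `ts` is a column that grows with each insert, and `last_sync_ts` is a value the job passes in (for example with `hive -hiveconf last_sync_ts=...`) and tracks itself between runs.

```sql
-- Hypothetical tables and columns, for illustration only.
-- Copy only the rows added since the previous run into the ES-backed table.
-- The es-hadoop documentation writes through the storage handler with
-- INSERT OVERWRITE; the rows are sent to Elasticsearch as documents.
INSERT OVERWRITE TABLE artists_es
SELECT s.id, s.name, s.ts
FROM source_artists s
WHERE s.ts > ${hiveconf:last_sync_ts};
```

This is a periodic batch rather than true real time: how fresh ES is depends on how often the job runs.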

(Hamdi Charef) #3

Here is my use case:
I stored all my data in an external table in Hive using
CREATE EXTERNAL TABLE TableName(...) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'radio/artists');
so with that I have access to my data in ES.
Can I update the data in the Hive table and, if yes, will the ES index be updated as well?

(Costin Leau) #4

The table definition is really just a view. To control whether the data is indexed vs appended, see the es.write.operation setting, as explained here.
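
For illustration, a table definition along these lines might look as follows. The table name `artists_es` and the columns are assumptions, not taken from the original post; `es.write.operation` and `es.mapping.id` are es-hadoop configuration settings. With `upsert`, a row whose `id` already exists in the index should update that document instead of creating a new one.

```sql
-- Hypothetical table and columns; only the TBLPROPERTIES keys are es-hadoop settings.
CREATE EXTERNAL TABLE artists_es (
  id   BIGINT,
  name STRING
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource'        = 'radio/artists',
  'es.mapping.id'      = 'id',      -- use the id column as the document _id
  'es.write.operation' = 'upsert'   -- update existing documents, insert new ones
);
```

The update only happens when rows are written through this table again; changes made to other Hive tables are not propagated to ES on their own.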
