Integrating Hadoop (specifically HDFS files) with the ELK stack

I am trying to integrate Hadoop with the ELK stack. My use case: I need to read data from a file in AVRO format stored at an HDFS path and show its contents on a Kibana dashboard.

Does anybody have an article with a step-by-step process?

Unfortunately I don't have any step-by-step content for doing this. Assuming you are running YARN alongside HDFS in your distribution, the easiest approach is probably to use something like Spark to read the AVRO files in parallel and ship the records to Elasticsearch using ES-Hadoop.
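As a rough illustration of that approach, here is a minimal PySpark sketch. It assumes the spark-avro module and the ES-Hadoop Spark connector are both on the job's classpath. The HDFS path, the index name `events`, and the host `es-host.example.com` are placeholders I made up, not values from your setup.

```python
def es_write_options(nodes, port=9200):
    """Build ES-Hadoop connector settings (all values passed as strings)."""
    return {
        "es.nodes": nodes,      # Elasticsearch host(s); placeholder supplied by caller
        "es.port": str(port),   # Elasticsearch HTTP port
    }


def copy_avro_to_es(spark, avro_path, index, options):
    """Read Avro files from avro_path and append the rows to an Elasticsearch index."""
    # Requires the spark-avro module to be available to the Spark job.
    df = spark.read.format("avro").load(avro_path)
    # "org.elasticsearch.spark.sql" is the Spark SQL data source shipped with ES-Hadoop.
    (df.write
       .format("org.elasticsearch.spark.sql")
       .options(**options)
       .mode("append")
       .save(index))


def main():
    # Imported here so the helpers above can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("avro-to-es").getOrCreate()
    copy_avro_to_es(
        spark,
        "hdfs:///data/events/*.avro",             # hypothetical HDFS path
        "events",                                 # hypothetical index name
        es_write_options("es-host.example.com"),  # hypothetical Elasticsearch host
    )
    spark.stop()
```

You would call `main()` from a script submitted with `spark-submit` against your YARN cluster. Once the documents are indexed, point a Kibana index pattern at the index to build the dashboard.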

I think the AVRO format will be more of a limiting factor than the fact that the data lives on HDFS. Spark and other Hadoop ecosystem technologies are usually better suited to ETL of that kind of data than other out-of-the-box tools.
