I want to save about 50TB of data per day from Logstash and search it with Elasticsearch. Is there a good solution for this?
According to the official introductions, I can only think of the three approaches below, but I am not sure whether they are correct, because of the large data volume. Could anyone help me please? Thanks a lot.
Solution #1:
1) Logstash outputs data into Elasticsearch.
2) Elasticsearch uses snapshots to keep a backup in Hadoop (HDFS), and can restore from it (rough sketch below).
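Just to make step 2 concrete, here is a minimal sketch of registering an HDFS snapshot repository and taking a snapshot over the REST API, driven from Python. It assumes the repository-hdfs plugin is installed; the endpoints, repository name, snapshot name, and index pattern are placeholders, not anything prescribed.

```python
import requests

ES = "http://localhost:9200"  # assumed Elasticsearch endpoint

# Register an HDFS snapshot repository (requires the repository-hdfs plugin).
requests.put(
    f"{ES}/_snapshot/hdfs_backup",
    json={
        "type": "hdfs",
        "settings": {
            "uri": "hdfs://namenode:8020/",    # assumed NameNode address
            "path": "/backups/elasticsearch",  # HDFS directory to hold snapshots
        },
    },
).raise_for_status()

# Snapshot one day's worth of Logstash indices into that repository.
requests.put(
    f"{ES}/_snapshot/hdfs_backup/snapshot-2015.01.01",
    json={"indices": "logstash-2015.01.01"},
).raise_for_status()
```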
Solution #2:
1) Logstash outputs data into Hadoop (HDFS).
2) Mount HDFS as a local filesystem with NFS.
3) Elasticsearch indexes the data from the local HDFS mount (rough sketch below).
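And a rough sketch of what step 3 could look like in practice, assuming the HDFS NFS gateway is mounted at /mnt/hdfs and the Logstash output there is newline-delimited JSON; the paths, index name, and client endpoint are made up for illustration.

```python
import glob
import json

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://localhost:9200"])  # assumed Elasticsearch endpoint

def actions():
    # Walk the NFS-mounted HDFS directory (placeholder path) and yield one
    # bulk action per JSON line found in the files Logstash wrote there.
    for path in glob.glob("/mnt/hdfs/logstash/*.json"):
        with open(path) as f:
            for line in f:
                yield {
                    "_index": "logstash-from-hdfs",
                    "_source": json.loads(line),
                }

# Bulk-index everything that is currently on the mount.
helpers.bulk(es, actions())
```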
Solution #3:
1) Logstash outputs data into Hadoop (HDFS).
2) Create a table, and an external table which is STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler', in Hive (rough sketch below).
3) Load the data into the table.
4) Elasticsearch can then index the data.
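For steps 2 and 3, a minimal sketch driven from Python with PyHive might look like the following; the HiveServer2 host, table names, columns, and the es.* settings are assumptions, and the HDFS-backed source table (logs_hdfs here) is expected to already exist.

```python
from pyhive import hive

# Assumed HiveServer2 endpoint.
conn = hive.connect(host="hive-server", port=10000)
cur = conn.cursor()

# External table backed by ES-Hadoop: rows written to it are indexed into
# Elasticsearch (es.resource / es.nodes are illustrative values).
cur.execute("""
    CREATE EXTERNAL TABLE IF NOT EXISTS logs_es (ts STRING, message STRING)
    STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
    TBLPROPERTIES ('es.resource' = 'logstash/event',
                   'es.nodes'    = 'localhost:9200')
""")

# Load the data: copy the HDFS-backed source table into the ES-backed table,
# which pushes the rows into Elasticsearch for indexing.
cur.execute("INSERT OVERWRITE TABLE logs_es SELECT ts, message FROM logs_hdfs")
```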
Why do you want to use Hadoop? It might sound like a strange question, but do consider it.
ES-Hadoop is useful when the data is already in Hadoop and you want to get it into ES, or when you have a computational grid like Spark where you crunch numbers and need to tap into the data in ES.
Since you are using Logstash, it looks like you already have the means to move data from your source to Elasticsearch. So why go through Hadoop?
Note that involving another system means allocating the necessary resources: your 50TB would have to sit in Hadoop (until it is consumed) and in Elasticsearch, plus the network and CPU overhead of moving it to HDFS and then on to Elasticsearch.