I am processing a large volume of data with Cascading and writing it to HDFS. Now I want to be able to search this data, so I have thought of these solutions:
- I am using the EsTap of Elasticsearch as the sink tap to write the data, and then using Elasticsearch to search it. But the time taken to write the data is too high (compared to the time taken by the default Hfs sink tap).
- Should I write the data normally into HDFS with Cascading's Hfs sink tap, and then use es-hadoop to move the data into Elasticsearch for search operations?
Please tell me which is the best method for a large volume of data. Also, is either of these approaches correct or wrong?
Thanks in advance.