How can I pull data from Hive to Logstash?

Hi,

I'm very new to Logstash. I have read access to a remote database, so I decided to pull data from it into Logstash. The data volume will be at least 1 TB per day.
How can I do capacity planning in terms of storage, networking, CPU, and memory?
What best practices should I follow?

Thanks,
Saravanan

You can use one of the database plugins in Logstash.

For example, the JDBC plugin.

Thanks for the response, Joseph :)

Yes, I can use the JDBC plugin to pull the data, but I want to do capacity planning for it.

Is it possible to pull terabytes of data with a single Logstash instance, or do I need to set up clustering? How would clustering work when pulling data from a remote database into Logstash?

You might have to scale your Logstash pipeline horizontally.
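As a rough starting point for the capacity question, the 1 TB per day figure can be turned into a sustained rate and an instance count. This is only a back-of-envelope sketch: it assumes 1 TB means 10^12 bytes and that the load is spread evenly over 24 hours. The per-instance throughput (5 MB/s here) and the 2x headroom factor are placeholder numbers, not measured Logstash figures; you would need to benchmark a single instance against your own data to get a real value.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mb_per_s(bytes_per_day: float) -> float:
    """Average rate in MB/s (1 MB = 10**6 bytes) if the load is spread evenly."""
    return bytes_per_day / SECONDS_PER_DAY / 1e6


def instances_needed(bytes_per_day: float,
                     per_instance_mb_per_s: float,
                     headroom: float = 2.0) -> int:
    """Instances required to sustain the rate with a safety factor.

    per_instance_mb_per_s is an assumed value; measure it on your own setup.
    """
    required = sustained_rate_mb_per_s(bytes_per_day) * headroom
    return math.ceil(required / per_instance_mb_per_s)


if __name__ == "__main__":
    daily_bytes = 1e12  # 1 TB per day, from the question
    rate = sustained_rate_mb_per_s(daily_bytes)
    print(f"sustained rate: {rate:.2f} MB/s ({rate * 8:.1f} Mbit/s)")
    # 5.0 MB/s per instance is a hypothetical placeholder
    print("instances:", instances_needed(daily_bytes, per_instance_mb_per_s=5.0))
```

If the data arrives in bursts rather than evenly, the peak rate is what matters, so the headroom factor would need to be larger.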

The page on scaling might give you some general ideas about scaling the ELK stack.

For your case, the inputs will need slight modification.
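One possible reading of that modification (an interpretation, not something spelled out in this thread) is that each Logstash instance gets its own query in the JDBC input's `statement` option, covering a different slice of the table, so that no two instances pull the same rows. The sketch below generates such queries by splitting a numeric key range into contiguous slices. The table name `events` and the key column `id` are hypothetical, and it assumes the table has a roughly evenly distributed numeric key.

```python
def split_range(lo: int, hi: int, parts: int) -> list[tuple[int, int]]:
    """Split the inclusive range [lo, hi] into up to `parts` contiguous slices."""
    total = hi - lo + 1
    base, extra = divmod(total, parts)
    slices = []
    start = lo
    for i in range(parts):
        # The first `extra` slices take one additional key each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more parts than keys: skip empty slices
        end = start + size - 1
        slices.append((start, end))
        start = end + 1
    return slices


def statements(table: str, key: str, lo: int, hi: int, parts: int) -> list[str]:
    """Build one SELECT per slice; table and key names are supplied by the caller."""
    return [
        f"SELECT * FROM {table} WHERE {key} BETWEEN {a} AND {b}"
        for a, b in split_range(lo, hi, parts)
    ]


if __name__ == "__main__":
    # `events` and `id` are made-up names for illustration only.
    for sql in statements("events", "id", 1, 10, 3):
        print(sql)
```

Each generated query would go into a separate instance's input configuration. Splitting by a date or partition column instead of a numeric key follows the same idea.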

Also, please check the JDBC streaming plugin, which might be useful for your case.

Okay.
