I've been tasked with implementing Logstash as a means of getting data into Elasticsearch. I've heard the name before but don't know much more about it than that, so to avoid making mistakes in the early stages I wanted to get some advice on how best to go about it.
I hear some people run Logstash on their Elasticsearch nodes while others do not. Are there pros and cons to these approaches? I would have thought that having Logstash on its own box is better, so it doesn't have an I/O impact on Elasticsearch itself.
I have 3 nodes running Elasticsearch 5.6.6. Does this inform how many Logstash nodes I should set up? Does Logstash cluster in a similar way to Elasticsearch?
Are there any other dependencies I need to put in place to get data from Logstash to Elasticsearch?
Are there any guides people can point me to to help me get set up quickly?
Are there any gotchas I should be aware of going into this?
I hear some people run Logstash on their Elasticsearch nodes while others do not. Are there pros and cons to these approaches? I would have thought that having Logstash on its own box is better, so it doesn't have an I/O impact on Elasticsearch itself.
Segregating Logstash and Elasticsearch is typically a good idea but don't focus too much on deployment infrastructure at this point.
I have 3 nodes running Elasticsearch 5.6.6. Does this inform how many Logstash nodes I should set up?
No.
Does Logstash cluster in a similar way to Elasticsearch?
No.
Are there any other dependencies I need to put in place to get data from Logstash to Elasticsearch?
Thanks for your feedback, Magnus. Do you have any guidance on the spec for the Logstash nodes?
I'm looking to start with VMs for Logstash. I'm guessing that's OK until I find that throughput is such that they need dedicated disks or NICs?
I have been reviewing some of the guidance on setting up Logstash, but I'm not seeing much by way of recommended specs for the servers. Is there a minimum RAM requirement? Should I disable swap to help performance?
The more I read, the more applications I'm finding for it.
The primary focus at the moment is to log usage data for an API, where each API call would output a log entry, and to collate reports on that data. Knowing which aspects of the API are of most interest to end users would help us decide where to direct our improvement efforts and make sure we market the popular aspects better.
Arguably a side project, but I also have data that I move from RabbitMQ to HDFS and Hive on Hadoop, which this might be able to help me with.
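For the API usage logging, this is roughly what I have in mind on the application side. It is only a minimal sketch: the function names and the field names (`endpoint`, `method`, `status`, `duration_ms`, `client_id`) are placeholders I made up, not anything Logstash requires. The idea is to write one JSON object per line so that the events are easy to parse downstream.

```python
import json
from datetime import datetime, timezone


def api_usage_event(endpoint, method, status, duration_ms, client_id=None):
    """Build one JSON log line describing a single API call.

    All field names here are hypothetical; rename them to suit the reports.
    """
    event = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "endpoint": endpoint,
        "method": method,
        "status": status,
        "duration_ms": duration_ms,
        "client_id": client_id,
    }
    # json.dumps escapes any newlines inside values, so each event stays on one line.
    return json.dumps(event, sort_keys=True)


def append_event(path, line):
    """Append one event line to a log file that a shipper could tail."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Each API handler would then call something like `append_event("api-usage.log", api_usage_event("/v1/orders", "GET", 200, 12.5))`, where the path and endpoint are again just examples. My assumption is that Logstash could read such a file with its `file` input and a `json` codec, but I haven't tested that yet.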