I have read a lot of docs about ELK architecture and have seen different designs. My team has to deploy a new ELK stack and we are facing some choices:
1. beats ==> kafka ==> logstash ==> elasticsearch
2. beats ==> kafka ==> elasticsearch (where beats have the role of logstash)
3. beats ==> logstash ==> elasticsearch
4. beats ==> elasticsearch
All kinds of logs should be parsed and sent to the ELK stack: firewall, proxy (nginx), LB (haproxy, kemp), syslog, Windows logs, journal logs from Linux machines, ...
The workload is around 250 GB/day. We want to use Kafka as a buffer.
It seems that Logstash is being replaced by Beats? Any word about this?
If you can do option 4 (beats ==> elasticsearch), I'd go for it, as there are far fewer things to maintain.
Let me answer:
It seems that Logstash is being replaced by Beats? Any word about this?
and
(where beats have the role of logstash)
Beats and Logstash are not the same thing. You can, however, compare Logstash inputs with Beats, and Logstash filters with the Elasticsearch ingest node feature.
In short: if you can do all the parsing/processing of your data in Elasticsearch with an ingest node, just use Beats and Elasticsearch.
If you have a more advanced pipeline, or if you want to send the data collected by Beats to Elasticsearch AND to another output such as a storage system or Kafka, then use Logstash.
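To make the "Beats + ingest node" option a bit more concrete, here is a rough sketch of what the two pieces could look like for nginx access logs. It only builds and prints the request body and the Filebeat setting; nothing is sent to a cluster. The pipeline name `nginx-access` and the host `localhost:9200` are made up for illustration, and the single grok processor assumes nginx writes its default combined log format.

```python
import json

# Hypothetical pipeline name and host, chosen for illustration only.
PIPELINE_ID = "nginx-access"
ES_HOST = "localhost:9200"


def build_pipeline() -> dict:
    """Body for `PUT _ingest/pipeline/<PIPELINE_ID>` in Elasticsearch."""
    return {
        "description": "Parse nginx access logs shipped by Filebeat",
        "processors": [
            # grok plays the role a Logstash grok filter would play:
            # it splits the raw line in `message` into named fields.
            {
                "grok": {
                    "field": "message",
                    "patterns": ["%{COMBINEDAPACHELOG}"],
                }
            }
        ],
    }


def build_filebeat_output() -> dict:
    """The `output.elasticsearch` section of filebeat.yml, as a dict."""
    return {
        "output.elasticsearch": {
            "hosts": [ES_HOST],
            # Tells Elasticsearch which ingest pipeline to run on each event.
            "pipeline": PIPELINE_ID,
        }
    }


if __name__ == "__main__":
    print(f"PUT _ingest/pipeline/{PIPELINE_ID}")
    print(json.dumps(build_pipeline(), indent=2))
    print("# filebeat.yml equivalent:")
    print(json.dumps(build_filebeat_output(), indent=2))
```

If your parsing needs fit into processors like this, you don't need Logstash in the middle. Once you need several outputs or heavier processing, that is where Logstash comes back in.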