Newbie here, so maybe a strange question. In my environments Kafka is used, and the main reason for this is the ability to temporarily queue data. This functionality is (now) available in Logstash.
So, simple question: can we remove Kafka?
I understand that you can also look at this the "microservice way" and still favor a split into a manipulation phase and a data director phase, but a container/Logstash/volume setup looks to me like the optimal solution?
Please comment. KR Henk
Can you give context on this? Kafka and Logstash are two completely different tools with completely different use cases.
What is the functionality that you say Logstash has now?
If you are running a large enterprise data ingestion/aggregation system, then an event bus like Kafka is highly recommended. Some reasons include:
- Downstream systems like Logstash/Elastic or other 3rd party systems may need restarting or updating frequently. Without a buffer in front of them, this means you may lose data from UDP/TCP/streaming sources while they are down.
- The event bus (Kafka) acts as a buffering layer and smooths the flow of data into Elastic or other downstream systems. The "velocity" and "veracity" of the data are made much better using an event bus.
- Other systems can get the data from Kafka without bothering your team
- There are a lot of other reasons too.
Overall it depends on the design/architecture of your platform and how much data resiliency you require
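To make the buffering point concrete, here is a minimal toy sketch in Python. It is not Kafka or Logstash code; the function `run` and its parameters are made up for illustration, and it assumes the sources cannot retry (as with UDP), so anything sent while the downstream system is down is lost unless something buffers it.

```python
from collections import deque


def run(events_per_tick, downstream_up, use_buffer):
    """Toy model of ingestion; returns (delivered, lost, pending) counts.

    downstream_up is a list of booleans, one per time tick, saying
    whether the downstream system (e.g. an indexer) is reachable.
    """
    buffer = deque()
    delivered = 0
    lost = 0
    for up in downstream_up:
        incoming = list(range(events_per_tick))
        if use_buffer:
            # Events always land in the buffer first.
            buffer.extend(incoming)
            if up:
                # Downstream drains everything that has piled up.
                delivered += len(buffer)
                buffer.clear()
        elif up:
            delivered += len(incoming)
        else:
            # No buffer and nobody listening: the events are gone.
            lost += len(incoming)
    return delivered, lost, len(buffer)


# Downstream restarts during ticks 2 and 3.
outage = [True, False, False, True]
print(run(10, outage, use_buffer=False))  # (20, 20, 0)
print(run(10, outage, use_buffer=True))   # (40, 0, 0)
```

In this model the buffered run delivers all 40 events after the outage, while the unbuffered run drops the 20 that arrived during the restart. A real buffer also has finite capacity and retention, which the sketch ignores.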
The big difference is that the persistent queue functionality in Logstash is specific to each node and does not run in clustered mode. If you lose a node, you may therefore lose the data queued on it. Kafka, however, supports running in clustered mode with data replicated across nodes, so losing a node generally does not lead to data loss. It is therefore generally more resilient.
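A rough way to picture the difference, as a toy Python sketch only: the helpers `place` and `readable_after_failure` are hypothetical, and the round-robin placement is an assumption for illustration, not how Kafka actually assigns replicas. With one copy per event (node-local queues), losing a node loses what was queued there; with two copies on different nodes, a single node failure loses nothing.

```python
def place(events, nodes, copies):
    """Store each event on `copies` consecutive nodes (toy round-robin)."""
    storage = {node: set() for node in nodes}
    for i, event in enumerate(events):
        for k in range(copies):
            storage[nodes[(i + k) % len(nodes)]].add(event)
    return storage


def readable_after_failure(storage, failed):
    """Return the events that still exist on at least one surviving node."""
    alive = set()
    for node, stored in storage.items():
        if node not in failed:
            alive |= stored
    return alive


events = [f"event-{i}" for i in range(6)]
nodes = ["node-a", "node-b", "node-c"]

node_local = place(events, nodes, copies=1)   # each event lives on one node only
replicated = place(events, nodes, copies=2)   # each event lives on two nodes

print(len(readable_after_failure(node_local, {"node-a"})))  # 4
print(len(readable_after_failure(replicated, {"node-a"})))  # 6
```

Note that this models a node that is lost together with its disk. If a Logstash node merely restarts and its queue directory survives, the persisted events are still there.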