I want to build a large ELK installation.
At one site (the satellite) we want to have Logstash, Redis and Logstash Forwarder.
At the other site (the master) we want to have Logstash, Redis, Logstash, Elasticsearch and Kibana.
What I don't really know yet is how this works at the satellite site. How can I get the data from Redis to Logstash Forwarder?
Are you actually talking about logstash-forwarder, a separate (deprecated) product found at https://github.com/elastic/logstash-forwarder, or Logstash acting as a forwarder?
And why would you want to get data from Redis to Logstash/logstash-forwarder? Redis would typically act as a buffer from which Logstash in the master server would pull events.
Between the satellite installation and the master installation there is a VPN tunnel.
So if the VPN tunnel is temporarily unavailable, we must not lose the collected logs.
In that case we need a buffer like Redis: Logstash collects the logs from some devices, and these logs should go to Redis and then to Logstash Forwarder. When the VPN tunnel is up, Logstash Forwarder should ship the data to the first Logstash instance at the master installation.
Ship directly to Logstash if the VPN tunnel is up and otherwise buffer in Redis? No, that doesn't make sense. Just buffer the events in Redis all the time.
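As a minimal sketch of "always buffer in Redis", the satellite Logstash could write every event to a Redis list, whether or not the tunnel is up. The TCP port, the Redis address and the key name `logstash` are placeholders I made up for illustration, not values from your setup:

```
# Satellite site: receive events and always push them into the local Redis.
input {
  tcp {
    port => 5000                 # assumed port; replace with your real inputs
  }
}

output {
  redis {
    host      => ["127.0.0.1"]   # assumption: Redis runs on the same server
    port      => 6379
    data_type => "list"          # events are appended to a Redis list
    key       => "logstash"      # hypothetical key name
  }
}
```

Keep in mind that Redis holds the list in memory, so the available RAM on that server limits how long an outage it can absorb.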
Sorry for the misunderstanding.
Sure, Redis should buffer the events all the time. What I meant is that Redis becomes really important when the VPN tunnel is down. The only thing I still need to know is: how can I get the Redis data into logstash-forwarder?
Logstash-Forwarder has been replaced by Filebeat. The normal flow in this scenario is something like this:
Filebeat ---> Logstash ---> Redis ----(DC Boundary)---> Logstash ---> Elasticsearch
If you only have file based inputs, these generally handle stoppages in the pipeline quite well as they can just stop reading until the rest of the processing pipeline clears and then continue. If you only have file inputs and will not risk losing data due to aggressive log rotation, you may be able to do without using Redis and send data directly from Filebeat to the Logstash instance in the remote DC.
Other types of inputs, e.g. inputs based on TCP and/or UDP, are often not able to stop processing without causing problems upstream or losing data. If you have these types of inputs, using a message queue such as Redis for buffering is often recommended.
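For the part of the flow after the DC boundary, a rough sketch of the receiving Logstash could look like the following. It pulls events from the Redis list at the other site and indexes them. The hostname, the key name and the Elasticsearch address are assumptions for illustration only:

```
# Central site: pull buffered events from the remote Redis and index them.
input {
  redis {
    host      => "redis.satellite.example"   # hypothetical host, reached through the tunnel
    port      => 6379
    data_type => "list"
    key       => "logstash"                  # must match the key the sending Logstash writes to
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]              # assumed Elasticsearch address
  }
}
```

Because this Logstash pulls from Redis, events simply accumulate in the list while the link is down and, as far as I know, are picked up again once the connection returns.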
But in our case we need a buffer at both the satellite and the master site.
I think this is a possible setup for what we have in mind:
Logstash --> Redis --> Filebeat/Logstash-Forwarder
Our problem now is how to get the data from Redis to Filebeat/Logstash-Forwarder.
We also need secure communication between Filebeat/Logstash-Forwarder (satellite) and the first Logstash (master), for example with lumberjack.
Could you help us with this, or tell us where we can find this information?
I hope that describes our situation.
Filebeat, and Logstash-Forwarder before it, is a data collection agent, and as such is almost always placed at the start of the ingest pipeline, not in the middle. Logstash is able to read from and write to Redis and can also connect to other Logstash instances via the lumberjack protocol, which like the beats protocol offers encryption and compression.
The architecture you describe sounds complex, and I would like to understand whether it could be simplified. Do you have inputs that are not file based? Why do you need Redis in both locations? Is it because you want to be able to buffer locally collected logs?
Sorry for the late reply.
I was trying to collect different types of data, including syslog and NetFlow.
I was trying to have a satellite instance in the branch offices.
Is there any guide or other resource I can use to set up my environment?
Thanks in advance.
Can no one help me with this issue?
Does nobody have a solution for me?
As Christian rightly said, I need Redis only as a buffer.
The latest version of Logstash has support for persistent queues (PQ), allowing received data to be written to disk, which can be used as a buffer. You should therefore now be able to deploy it as follows:
Filebeat ---> Logstash (with PQ) ---(VPN tunnel)---> Logstash (with PQ) ---> Elasticsearch
The first Logstash instance can also collect data that is not file-based through other input plugins. The Logstash to Logstash link supports encryption as well as compression.
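As a rough sketch, the sending side of that Logstash to Logstash link could use the lumberjack output. The hostname, port and certificate path are placeholders, and the persistent queue itself is not part of the pipeline file; it is switched on in logstash.yml with `queue.type: persisted`:

```
# Remote-site Logstash: ship events through the tunnel to the central Logstash.
# The persistent queue is enabled separately in logstash.yml (queue.type: persisted).
output {
  lumberjack {
    hosts           => ["central-logstash.example"]        # hypothetical hostname
    port            => 5044                                # assumed port
    ssl_certificate => "/etc/logstash/certs/central.crt"   # placeholder path
    codec           => json
  }
}
```

On the central side, a beats input listening on the same port with SSL enabled and the matching certificate and key should be able to receive this. The exact names of the SSL options depend on the plugin version, so please check the documentation for the version you install.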
Hey Christian, thank you for your quick reply.
The site where you have placed Filebeat would be our remote location.
The site where you have placed Elasticsearch would be our central location.
All logs (Windows logs, firewall logs, switch logs...) at the remote location should be collected by a single collector. I think it should be Logstash, because Filebeat has no input for logs arriving on port 514.
So is Filebeat really needed at the remote site? At the remote site, all applications (Logstash, Filebeat?, ...) should run on one server. For the central site, one server is planned per application. And we have planned two Elasticsearch servers because of the volume of logs.
If you have no logs that are file-based you may not need to use Filebeat. I would recommend using 3 Elasticsearch nodes if you are looking for resilience and high availability.
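Without Filebeat, the Logstash collector at the remote site could receive the network-based logs directly. This is only a sketch: the NetFlow port and the `type` values are assumptions, and listening on port 514 usually requires elevated privileges, so many people pick a higher port and redirect to it:

```
# Remote-site collector: receive syslog and NetFlow directly in Logstash.
input {
  syslog {
    port => 514            # ports below 1024 usually need elevated privileges
    type => "syslog"       # hypothetical label for later filtering
  }
  udp {
    port  => 2055          # assumed NetFlow export port
    codec => netflow       # decodes NetFlow packets into events
    type  => "netflow"     # hypothetical label for later filtering
  }
}
```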
Many thanks for your help.
I built the solution as you described.