External Filebeat plus Logstash best approach

Hi everyone,
Our system takes in data from thousands of clients. Basically, we develop client software that each of our clients installs and runs. That software connects to our system (in our network). Our idea is to install Filebeat on each client machine and have it connect to Logstash. We have Logstash servers, but we don't want to expose them externally, so we don't know what the best approach would be. Is there a way to protect the Logstash servers? Should we create a separate, dedicated Logstash instance? Could we do that in Mesos using Docker? If so, how can we expose the IP? Or should we expose a microservice? How do we deal with the protocol used between Filebeat and Logstash, as opposed to HTTP?
I really appreciate your help.
Thanks!

If you don't expose something externally I don't see how you're going to pull this off. You could put one or more Logstash servers in a DMZ and allow external access to those servers. If you don't want the DMZ servers to be able to connect to servers in your internal network you can spin up a broker (Redis, RabbitMQ, ...) in the DMZ and pull from those queues from the internal network.

Our systems team doesn't want clients connecting directly to the Logstash servers, because any flaw in Logstash could be exploited. Could you elaborate on the Redis solution? How would Filebeat connect to that?

Thanks.

Filebeat has built-in Redis support (https://www.elastic.co/guide/en/beats/filebeat/current/redis-output.html), so if it's acceptable for clients to connect to Redis then that's a good option. You would then have a Redis server in the DMZ and would connect to it from your internal network to pull the messages.

Can a Logstash instance receive data from both a Redis queue and Filebeat? Or should we create a new Logstash instance for each? Apart from my problem, I mean whether or not I can expose a Logstash server, isn't it better to have Redis as a buffer? I read that Redis can act as a buffer. If I understood correctly, Logstash will ask Redis for events only when it has time to process them. Is that right?

Can a Logstash instance receive data from both a Redis queue and Filebeat?

Yes.
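
For illustration, a single pipeline can declare both inputs side by side. This is only a minimal sketch, not a tested production config: the hostname `redis.dmz.example.com`, the key `filebeat`, and port 5044 are placeholder assumptions, and the key has to match whatever key Filebeat's Redis output is configured to write to.

```
input {
  # Events sent directly by Filebeat instances that are allowed to reach Logstash
  beats {
    port => 5044
  }

  # Events pulled from the Redis list in the DMZ
  redis {
    host      => "redis.dmz.example.com"  # placeholder hostname
    port      => 6379
    data_type => "list"
    key       => "filebeat"               # assumed key; must match Filebeat's Redis output
  }
}

output {
  # Print events to the console while testing; replace with your real output
  stdout { codec => rubydebug }
}
```

With this layout the connection to Redis is opened from the internal network outwards, so nothing in the DMZ needs to connect to the internal Logstash host.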

Apart from my problem, I mean whether or not I can expose a Logstash server, isn't it better to have Redis as a buffer?

Sure.

I read that Redis can act as a buffer. If I understood correctly, Logstash will ask Redis for events only when it has time to process them. Is that right?

Logstash reads from all configured inputs with the same priority.

OK. What is the advantage of putting Redis between Filebeat and Logstash?

It acts as a buffer that can absorb message surges. It also makes it easy to load balance the event processing by spinning up multiple Logstash instances.
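
As a rough sketch of the load-balancing point: if several Logstash instances run the same redis input against the same list, each event is popped from the list by only one of them, so adding instances adds consumers. The hostname and key below are placeholders, and the `threads` and `batch_count` values are illustrative rather than tuned recommendations.

```
# Input fragment to deploy unchanged on every Logstash instance
# that should share the work
input {
  redis {
    host        => "redis.dmz.example.com"  # placeholder hostname
    data_type   => "list"                   # list semantics: each event is consumed once
    key         => "filebeat"               # assumed key shared by all instances
    threads     => 2                        # illustrative: consumer threads per instance
    batch_count => 125                      # illustrative: events fetched per request
  }
}
```

While the consumers are slower than the incoming traffic, the list simply grows in Redis, which is the buffering effect. Redis memory is the limit on how large a surge it can absorb.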

Also, does that even matter if the external clients aren't allowed to connect directly to Logstash?
