Logstash vs Filebeat in server farm


(Dor Rotman) #1

Hello.
I am currently examining how to deploy log shipping within my server farm, and trying to understand when it is right to use Logstash on the application servers and when it is right to use Filebeat. I have multiple application servers and a single ELK server to which I want the logs delivered.

My dilemma is between the following topologies:

  • logstash(multiple) => redis -> logstash -> elasticsearch
  • filebeat(multiple) => logstash -> redis -> logstash -> elasticsearch

So, some questions:

  • Are there any guidelines on when to implement these topologies? Why not use Logstash on each app server?

  • I saw some mention of the JVM, but is there any documentation on the load Logstash creates on each server?

  • Looking to the future, with the introduction of the Beats framework, are Beats going to take Logstash's place in a server farm scenario?

Thanks,
Dor.


(Magnus Bäck) #2

Are there any guidelines on when to implement these topologies? Why not use Logstash on each app server?

The main reason would be that Logstash is fairly heavyweight. For a 40-core server with 100 GB of RAM it probably doesn't matter, but if your use case calls for two-core servers with 2 GB of RAM it is significant.

I saw some mention of the JVM, but is there any documentation on the load Logstash creates on each server?

That depends on several factors, including how many events per second you want Logstash to process. Memory-wise, I think you should expect it to need at least a couple of hundred MB.

Looking to the future, with the introduction of the Beats framework, are Beats going to take Logstash's place in a server farm scenario?

For shipping files I'd say that Filebeat is the way to go, but I don't expect the Beats stuff to be extended with general-purpose filters and flexible outputs like Logstash has today.


(ruflin) #3

Some notes from my side here:

  • There are no clear criteria for when to use Filebeat on the client side and when to install full Logstash. In general, Filebeat uses far fewer resources than Logstash, but as always it depends on the setup. I can't give you a specific ratio.
  • Beats are not intended to replace Logstash, as one of the key criteria is to keep them as lightweight as possible. The impact of running one or more Beats on every server in your farm should be as small as possible. For things like grok and data transformation you will still need Logstash.
  • What is the main reason you are using Redis rather than filebeat -> LS -> ES? I know there are different reasons for this setup, but it would be nice to hear what yours is. Filebeat supports sending data to multiple LS instances for load balancing and guarantees at-least-once delivery.
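
To make the last point a bit more concrete, here is a rough Python sketch of the idea behind load balancing with at-least-once delivery: batches are spread round-robin over several endpoints, and a batch is retried until one endpoint accepts it. This is not Filebeat's actual implementation; `FlakyEndpoint` and `ship` are hypothetical names made up for illustration, and the endpoints are in-memory stand-ins for Logstash instances.

```python
import itertools


class FlakyEndpoint:
    """Hypothetical stand-in for a Logstash instance; fails its first `fail_first` sends."""

    def __init__(self, name, fail_first=0):
        self.name = name
        self.fail_first = fail_first
        self.received = []

    def send(self, batch):
        if self.fail_first > 0:
            self.fail_first -= 1
            raise ConnectionError(f"{self.name} unavailable")
        self.received.extend(batch)


def ship(batches, endpoints, max_attempts=10):
    """Round-robin batches over endpoints; retry each batch until one endpoint accepts it."""
    rotation = itertools.cycle(endpoints)
    for batch in batches:
        for _ in range(max_attempts):
            endpoint = next(rotation)
            try:
                endpoint.send(batch)
                break  # accepted, move on to the next batch
            except ConnectionError:
                continue  # try the next endpoint with the same batch
        else:
            raise RuntimeError("batch could not be delivered")


ls1 = FlakyEndpoint("ls1", fail_first=1)  # first send to ls1 fails
ls2 = FlakyEndpoint("ls2")
ship([["a", "b"], ["c"], ["d"]], [ls1, ls2])
```

In a real network a shipper can also resend a batch whose acknowledgement was lost after the receiver had already stored it, which is why the guarantee is "at least once" rather than "exactly once". The sketch does not model that case.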

(Dor Rotman) #4

Thanks for the quick replies!
Most lines in my log files go into Elasticsearch, so there is minimal grepping going on.
There is some use of functionality such as GeoIP and UserAgent, but I would rather run it on the Logstash server and maintain it centrally than across all application servers.
So it seems I am mostly delivering files, for which Filebeat is probably the better product.

I am considering Redis for two reasons:

  1. Unrelated to products: Network topology issues that prevent a connection from the application servers to the Logstash server. However, that should change in the future.
  2. Related to products: As far as I understand, a Redis queue is a good way to take the load off all the other servers in the farm. That is, I don't want Filebeat to try to push into Logstash once and possibly fail; I want Logstash to pull when convenient. I was hoping to benefit from that as well, if my assumption is correct.
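
To illustrate what I mean by push versus pull, here is a minimal Python sketch. The `Buffer` class is a hypothetical in-memory stand-in for a Redis list used as a queue (no real Redis is involved): the shipper side only appends, and the indexer side takes a batch whenever it is ready.

```python
from collections import deque


class Buffer:
    """In-memory stand-in for a Redis list used as a queue (assumption: no real Redis)."""

    def __init__(self):
        self._items = deque()

    def push(self, event):
        # Shipper side: a cheap append that never waits on the indexer.
        self._items.append(event)

    def pull(self, max_events):
        # Indexer side: take up to max_events, oldest first, when ready.
        batch = []
        while self._items and len(batch) < max_events:
            batch.append(self._items.popleft())
        return batch

    def __len__(self):
        return len(self._items)


buffer = Buffer()
for i in range(5):
    buffer.push(f"log line {i}")

first = buffer.pull(max_events=3)   # indexer drains what it can handle
backlog = len(buffer)               # the rest waits in the buffer
second = buffer.pull(max_events=3)
```

The point of the design is that a slow or unavailable indexer only makes the backlog grow; the application servers keep pushing at their own pace.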

#5

I agree with that. As I picture it, it could look like this:

|Internet|--> App Server --> Redis |DMZ Firewall| <-- Logstash

The idea is to have no connection from DMZ to internal networks.

We will try Apache Kafka instead of Redis, though, as there are some doubts regarding Redis's performance. A POC will show :wink:


(Priyam Shukla) #6

My case is quite similar: I am trying to decide whether to use Filebeat as log collector, processor and shipper, or whether I should use Logstash.
I have gone through many write-ups on Filebeat vs Logstash. Nearly all of them suggest that Filebeat has a smaller footprint and uses fewer resources than Logstash, whereas Logstash has more powerful data processing and a wider range of inputs/outputs than Filebeat.

Can anyone explain how exactly data processing in Logstash is more powerful, i.e. what Filebeat cannot do when enhancing/filtering data that Logstash can? And which inputs does Filebeat support (given that it is limited)?


(Magnus Bäck) #7

Can anyone explain how exactly data processing in Logstash is more powerful, i.e. what Filebeat cannot do when enhancing/filtering data that Logstash can...

See https://www.elastic.co/guide/en/logstash/current/filter-plugins.html for examples of what kind of operations you can do with Logstash filters. With very few exceptions none of those operations are possible with Filebeat.
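
As a rough idea of the kind of work such a filter does, here is a Python analogue of grok-style parsing: an unstructured line is turned into named, typed fields. This is only a sketch and not Logstash code; the log line, the pattern, the field names and the `parse_failed` tag are all made up for illustration.

```python
import re

# Hypothetical log line and pattern, chosen only for illustration.
LINE = "203.0.113.7 GET /index.html 200 1532"
PATTERN = re.compile(
    r"(?P<client_ip>\S+) (?P<method>[A-Z]+) (?P<path>\S+) "
    r"(?P<status>\d{3}) (?P<bytes>\d+)"
)


def parse(line):
    """Turn a raw line into a dict of named fields, or tag it if it does not match."""
    match = PATTERN.fullmatch(line)
    if match is None:
        # Keep the raw message and mark it, so unparsed lines are not lost.
        return {"message": line, "tags": ["parse_failed"]}
    event = match.groupdict()
    event["status"] = int(event["status"])  # convert numeric fields from strings
    event["bytes"] = int(event["bytes"])
    return event


event = parse(LINE)
unparsed = parse("not a log line")
```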

and which inputs does Filebeat support (given that it is limited).

Filebeat reads log files and nothing else.


(Priyam Shukla) #8

@magnusbaeck that was helpful. Thanks!