Logstash vs Filebeat in server farm

dorr · November 30, 2015, 10:37am

Hello.
I am currently examining how to deploy log shipping within my server farm, and trying to understand when it is right to use Logstash on the application servers and when it is right to use Filebeat. As you can understand, I have multiple application servers and a single ELK server to which I want the logs delivered.

My dilemma is between the following topologies:

logstash(multiple) => redis -> logstash -> elasticsearch
filebeat(multiple) => logstash -> redis -> logstash -> elasticsearch

So, some questions:

Are there any guidelines on when to implement these topologies? Why not use Logstash on each app server?
I saw some mention of the JVM, but is there any documentation on the load Logstash creates on each server?
Looking at the future, with the introduction of the Beats framework, are they going to take Logstash's place in a server farm scenario?

Thanks,
Dor.

magnusbaeck · November 30, 2015, 12:08pm

Are there any guidelines on when to implement these topologies? Why not use Logstash on each app server?

The main reason would be that Logstash is fairly heavy-weight. For a 40-core server with 100 GB of RAM it probably doesn't matter, but if your use case calls for two-core servers with 2 GB of RAM it is significant.

I saw some mention of the JVM, but is there any documentation on the load Logstash creates on each server?

That depends on several factors, including how many events per second you want Logstash to process. Memory-wise I think you should expect it to need at least a couple of hundred MB.

Looking at the future, with the introduction of the Beats framework, are they going to take Logstash's place in a server farm scenario?

For shipping files I'd say that Filebeat is the way to go, but I don't expect the Beats stuff to be extended with general-purpose filters and flexible outputs like Logstash has today.

ruflin · November 30, 2015, 12:09pm

Some notes from my side here:

There are no clear criteria for when to use Filebeat on the client side and when to install full Logstash. In general, Filebeat uses far fewer resources than Logstash, but as always it depends on the setup. I can't give you a specific ratio.
Beats are not intended to replace Logstash, as one of the key criteria is to keep them as lightweight as possible. The impact of running one or multiple Beats on all the servers in your server farm should be as small as possible. For things like grok and data transformation you will still need Logstash.
What is the main reason you are using Redis and not filebeat -> LS -> ES? I know there are different reasons for this setup, but it would be nice to hear what the reason is for you. Filebeat supports sending data to multiple LS instances for load balancing and guarantees at-least-once delivery.
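
To make the last point concrete, here is a minimal Python sketch of what "load balancing with at-least-once delivery" means in principle. It is not Filebeat's actual implementation; `FlakyEndpoint` and `ship` are hypothetical names invented for this illustration, and the endpoints are in-memory stand-ins for Logstash instances.

```python
import itertools


class FlakyEndpoint:
    """Hypothetical stand-in for a Logstash instance.

    It refuses the first `failures` batches to simulate an outage.
    """

    def __init__(self, name, failures=0):
        self.name = name
        self.failures = failures
        self.received = []

    def send(self, batch):
        if self.failures > 0:
            self.failures -= 1
            return False  # no acknowledgement came back
        self.received.extend(batch)
        return True  # batch acknowledged


def ship(batches, endpoints, max_attempts=10):
    """Send each batch until some endpoint acknowledges it."""
    rotation = itertools.cycle(endpoints)  # simple round-robin balancing
    for batch in batches:
        for _ in range(max_attempts):
            if next(rotation).send(batch):
                break  # acknowledged, move on to the next batch
        else:
            raise RuntimeError("no endpoint acknowledged the batch")


ls1 = FlakyEndpoint("ls1", failures=1)
ls2 = FlakyEndpoint("ls2")
ship([["a", "b"], ["c"], ["d"]], [ls1, ls2])
delivered = sorted(ls1.received + ls2.received)
```

The shipper only moves on once a batch is acknowledged, so nothing is dropped while at least one endpoint is reachable. The flip side of at-least-once is that a batch whose acknowledgement gets lost would be sent again, so the receiving side may see duplicates.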

dorr · November 30, 2015, 12:56pm

Thanks for the quick replies!
Most lines in my log files go into ElasticSearch so there is minimal grepping going on.
There is some use of functionality such as GeoIP, UserAgent and such, but I would rather run it on the Logstash server and maintain it centrally instead of across all application servers.
So it seems I am delivering files, for which Filebeat is probably the better product.

I am considering Redis for two reasons:

Unrelated to products: Network topology issues which prevent a connection from the application servers to the Logstash server. However, that should change in the future.
Related to products: As far as I understand, a Redis queue is a good solution to take the load off all other servers in the farm. Meaning, I don't even want Filebeat to try to push into Logstash once and possibly fail; I want Logstash to pull when convenient. I was hoping to benefit from that as well, if my assumption is correct.
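
The push-versus-pull idea can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: `ListBroker` is a hypothetical stand-in for a Redis list used as a queue, not a Redis client, and the server names and batch size are made up.

```python
from collections import deque


class ListBroker:
    """In-memory stand-in for a Redis list used as a queue (hypothetical)."""

    def __init__(self):
        self._items = deque()

    def push(self, event):
        # Shipper side: appending returns immediately, whether or not
        # any consumer is currently keeping up.
        self._items.append(event)

    def pull(self, max_events):
        # Indexer side: take at most `max_events`, oldest first.
        batch = []
        while self._items and len(batch) < max_events:
            batch.append(self._items.popleft())
        return batch

    def __len__(self):
        return len(self._items)


broker = ListBroker()

# Three application servers push their lines without waiting for the indexer.
for server in ("app1", "app2", "app3"):
    for n in range(4):
        broker.push(f"{server} line {n}")

backlog_before = len(broker)

# The central Logstash-like consumer pulls a batch when it has capacity.
first_batch = broker.pull(max_events=5)
backlog_after = len(broker)
```

The producers never block on the consumer; the backlog simply grows in the broker until the consumer drains it. In a real deployment the broker itself then becomes the component that needs enough memory and availability to absorb bursts.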

Alexander_Trumper · June 2, 2016, 12:13pm

I can agree with that. The way I picture it, it could look like this:

|Internet|--> App Server --> Redis |DMZ Firewall| <-- Logstash

The idea is to have no connection from DMZ to internal networks.

We will try Apache Kafka instead of Redis, though, as there are some doubts regarding Redis's performance. A POC will show.

priyam · June 9, 2017, 8:42am

My case is quite similar: I am trying to decide whether to use Filebeat as log collector, processor and shipper, or whether I should use Logstash.
I have gone through many write-ups on Filebeat vs Logstash. Nearly all of them suggest that Filebeat has a smaller footprint and uses fewer resources than Logstash, while Logstash has more powerful data processing and a wider range of inputs/outputs than Filebeat.

Can anyone help me understand how exactly data processing in Logstash is more powerful, i.e. what is it that Filebeat cannot do while enhancing/filtering data that Logstash can? And what inputs does Filebeat support (considering it is limited)?

magnusbaeck · June 11, 2017, 7:54pm

Can anyone help me understand how exactly data processing in Logstash is more powerful, i.e. what is it that Filebeat cannot do while enhancing/filtering data that Logstash can?

See the "Filter plugins" page of the Logstash Reference for examples of the kinds of operations you can do with Logstash filters. With very few exceptions, none of those operations are possible with Filebeat.

And what inputs does Filebeat support (considering it is limited)?

Filebeat reads log files and nothing else.
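
For readers wondering what "filtering" amounts to in practice, the following Python sketch shows the general kind of transformation a grok-style filter performs: turning a raw log line into named, typed fields. It is an illustration only, not Logstash code; the regular expression, the `parse` helper, the `parse_failed` tag and the sample line are all assumptions made up for this example.

```python
import re

# Pattern for an access-log style line (assumed format, for illustration).
LINE = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse(line):
    """Turn a raw line into a dict of fields, or tag it as unparsed."""
    match = LINE.match(line)
    if match is None:
        # Keep the raw text so nothing is lost; the tag name is hypothetical.
        return {"message": line, "tags": ["parse_failed"]}
    event = match.groupdict()
    # Convert numeric fields so they can be aggregated later.
    event["status"] = int(event["status"])
    event["bytes"] = 0 if event["bytes"] == "-" else int(event["bytes"])
    return event


sample = (
    '203.0.113.7 - - [09/Jun/2017:08:42:00 +0000] '
    '"GET /index.html HTTP/1.1" 200 512'
)
event = parse(sample)
unparsed = parse("not an access log line")
```

A shipper that only forwards lines would send `sample` as one opaque string; a filtering stage produces separate fields such as `client_ip`, `path` and `status` that can then be enriched (GeoIP, user agent) and queried individually.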

priyam · June 12, 2017, 1:34pm

@magnusbaeck that was helpful, thanks!
