Sorry for the rambling topic, but I am falling down a rabbit hole. Every time I feel like I get a handle on the ELK terminology and infrastructure, the floor drops out.
I have never completely understood why you would ship from a beat directly to Elasticsearch. Isn't ES only responsible for storing and indexing data for querying? I thought that for parsing log events you had to use Logstash. Does that mean parsing can now happen at the Beats level before the data is shipped out, so that in some cases Logstash isn't even needed?
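To make the question concrete, here is a minimal sketch of the two shipping options as I understand them from the Filebeat docs (hostnames are placeholders, and only one output can be enabled at a time):

```yaml
# filebeat.yml — option 1: ship events directly to Elasticsearch
output.elasticsearch:
  hosts: ["http://es-host:9200"]

# filebeat.yml — option 2: ship events to Logstash for parsing first
# output.logstash:
#   hosts: ["logstash-host:5044"]
```

Is option 1 ever a complete setup on its own, or does it only make sense when something else does the parsing?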
And pipelines? What are they, and how are they different from shipping via a beat to Logstash? Why would I use one over the other? I did try to pull this information from the documentation. To my knowledge I have never set up a pipeline, yet I have tons of data shipping. Am I missing out on something?
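For reference, what I'm calling a "pipeline" here is an Elasticsearch ingest pipeline, which as far as I can tell is a JSON document you PUT to `_ingest/pipeline/<name>` on the ES side. Something like this (the pipeline name, fields, and grok pattern are made up for illustration):

```json
{
  "description": "example pipeline that parses a raw log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IP:client_ip} %{WORD:method} %{URIPATHPARAM:request}"]
      }
    }
  ]
}
```

So is this essentially Logstash-style parsing, but running inside Elasticsearch instead of in a separate service?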
Finally, modules. I am not totally clear on what they accomplish. I thought they were predefined parsing setups for well-known products, but if that is the case, why are they tied to Beats and not to Logstash, which I thought did all the parsing? And why are they tied to pipelines?
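From what I can tell, a module is enabled per beat. For example, with Filebeat the config looks something like this (paths and the `nginx` module name are taken from the docs; I'm guessing at how the pieces relate):

```yaml
# filebeat.yml — tell Filebeat to load module configs from modules.d/
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml

# modules.d/nginx.yml — the module itself, enabled for access logs
# - module: nginx
#   access:
#     enabled: true
```

If enabling a module like this also installs an ingest pipeline into Elasticsearch, that would explain the connection, but I can't find that stated plainly anywhere.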
Is there any documentation that gives a broad overview of all this? I can only find details on each piece individually, not why or how they all fit together.
Thanks for everyone's time.