Logstash battle scars - metricbeat config and duplicate output

Not so much a question but just wanted to share some things to look out for.

I should start by saying that overall my experience of ELK has been good and I'm certainly sticking with it.
I just wanted to share a few details of things that I discovered the hard way.
Maybe these are already well known, and maybe I should have read the docs more carefully.

Firstly - be aware that if you use the "system" module in Metricbeat and just go with the default config, it outputs a LOT of data.
By default it was shipping the main metricsets every 10 seconds,
and my indices were filling up with gigabytes of data every day.
That created all sorts of problems for my ELK stack trying to cope with the volume, and it was only some time later that I eventually tracked this down as the cause.
I've changed the "period" value to several minutes, which is more than adequate for my purposes.
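For reference, the change is just the `period` setting in the system module config - a sketch assuming the stock `modules.d/system.yml` layout, with the metricsets list illustrative rather than exactly what I run:

```yaml
# modules.d/system.yml - sketch; your metricsets may differ
- module: system
  period: 5m          # default is 10s, which generated gigabytes per day for me
  metricsets:
    - cpu
    - memory
    - network
    - filesystem
```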

On a related matter:
if you use Beats (as I am doing), you may also end up, as I did, with a Logstash config that effectively duplicates your data.
Following the docs for setting up Beats with Logstash, I had a filebeat.conf file with an input section listening on the "beats" port, a filter section to process the data, and an output section.

The output section contains a line that looks like this:
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"

This line determines the name of the index that the Beats data ends up in.
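Putting those pieces together, my filebeat.conf looked roughly like this - a minimal sketch, with the port number and the elasticsearch host as assumptions:

```conf
# filebeat.conf - minimal sketch of the beats pipeline
input {
  beats {
    port => 5044                # the usual beats port
  }
}
filter {
  # grok / mutate / etc. to process the data
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}
```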

Not knowing any better, I also had a logstash.conf file with an input section listening on the standard Logstash port and a simple output section.
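For context, that second file looked something like this - a sketch, with the input type and port as illustrative assumptions rather than my exact settings:

```conf
# logstash.conf - sketch of the extra file that caused the duplication
input {
  tcp {
    port => 5000                # the "standard" logstash port in my setup
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # no explicit "index" setting here, so events fall back to the
    # default pattern (historically "logstash-%{+YYYY.MM.dd}")
  }
}
```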

This was all working well: I was getting a "filebeat-*" and a "metricbeat-*" index in Kibana with all my data.
What I didn't fully appreciate is that Logstash concatenates every file in its config directory into a single pipeline, so events from the beats input were also reaching the output in logstash.conf, giving me a "logstash-*" index that contained an exact duplicate copy of all the beats data.
The duplicate index could be found through the console if you went looking for it, but otherwise it was essentially invisible.
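If you want to check whether you have the same problem, the `_cat/indices` API (run from the Kibana Dev Tools console) will show any hidden backlog along with sizes and doc counts:

```
GET _cat/indices/logstash-*?v&s=index
```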

Because I'm only interested in the beats data, the solution I came up with was to comment out the "output" section in the main logstash.conf file.
This stopped the duplicate logstash index from being written, while, correctly, still sending the beats data as required.
I was also able to safely delete the backlog of logstash indices knowing that they were unnecessary duplicates.
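Deleting the backlog can likewise be done from the console - note that a wildcard delete like this only works if `action.destructive_requires_name` hasn't been enabled on the cluster:

```
DELETE /logstash-*
```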

I hope that someone finds this useful.

