Hello, and first of all I would like to say that your product looks awesome and is pretty noob-friendly (considering that I managed to more or less set it up in one day).
I am mainly looking to filter HAProxy logs, and it would be amazing if someone could give me a little guidance on what is needed to achieve that, because every tutorial I have followed is different and not exactly what I'm looking for.
I have a Debian server running HAProxy, and I have installed Filebeat 6.6.1 with the haproxy module enabled.
I have signed up for Elastic Cloud and added all the needed information to the configs, so I can see the logs coming in on my Elastic Cloud dashboard.
The 'messages' I am getting are the exact rows from the log file, e.g.: Mar 9 11:10:38 proksikas haproxy[23079]: proksikas CONNECT i.instagram.com:443 i.instagram.com:443 200 0 54 1552147836 1712 558 192.168.1.2 45964 43248 34.259.125.134 CONNECT i.instagram.com:443 HTTP/1.1 7248 192.168.1.101
What I am looking for is to have each of the columns stored as a separate field; from what I understand, Logstash could possibly do that with regex?
Sorry about the confusion, I'm doing my best to read more about what the ELK stack can do!
If that dashboard is there and is showing graphs of response codes (etc.), then that's a pretty good sign that everything is working correctly.
If you want to get face-to-face with your data, then try the Kibana Console. Jump over there and then perform a _search (docs) on the index you're using for your HAProxy data. It should show you 10 documents (the default limit), and all their fields.
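For example, a minimal Console request looks like this (assuming the default filebeat-* index pattern; substitute the index your HAProxy data is actually going into):

GET filebeat-*/_search
{
  "size": 10
}

If the module's ingest pipeline is parsing the lines, each hit should contain haproxy.* fields rather than only the raw message.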
Thank you for the response. As strange as this might sound, after leaving it overnight I actually started seeing the structured data without making any additional changes (the time range was always set to the last 5 years, so I wasn't just missing the traffic).
#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: false

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    # - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*
    # - /var/log/haproxy.log
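Since the haproxy module is enabled, the input above is left disabled; the log path would normally go in the module's own config instead. A rough sketch of modules.d/haproxy.yml (the path is an assumption for this setup; check the file shipped with Filebeat for the exact variable names):

- module: haproxy
  log:
    enabled: true
    # adjust to wherever HAProxy actually writes its log (assumption)
    var.paths: ["/var/log/haproxy.log"]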
Output of filebeat modules list:
root@kobra:/home/kobra# filebeat modules list
Enabled:
elasticsearch
haproxy
kibana
system
Disabled:
apache2
auditd
icinga
iis
kafka
logstash
mongodb
mysql
nginx
osquery
postgresql
redis
suricata
traefik
Depending on the HAProxy version and log format, the destination.port field might or might not be available. The haproxy module parses the port only if it is present in the log. Do you have some sample logs (please redact IPs)?
There is indeed no such data in the default HTTP log format, and I just realized that I am actually not looking for destination.port but frontend_port (the one that HAProxy is listening on).
Is there a way to edit the module to work with a custom log format?
Thank you
Sure. You will find the parser configuration in module/haproxy/log/ingest/default.json. You can modify it at will and install the updated version with filebeat setup --pipelines --modules haproxy afterwards.
If you look at the file you will find 4 predefined grok patterns (regex used to parse the line). You can just add your custom one to the list. Also see grok processor docs.
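Roughly speaking (this is only a sketch, not the shipped pipeline; the custom pattern and the haproxy.frontend_port field name below are made up for illustration), the grok processor in default.json has a patterns array, and you keep the existing four patterns in place and append your own to it:

{
  "grok": {
    "field": "message",
    "patterns": [
      "<the four patterns shipped with the module stay here>",
      "%{SYSLOGTIMESTAMP} %{HOSTNAME:haproxy.hostname} haproxy\\[%{NUMBER:haproxy.pid:int}\\]: %{IP:haproxy.client.ip} %{NUMBER:haproxy.frontend_port:int} %{GREEDYDATA:haproxy.raw_request}"
    ]
  }
}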
Depending on how you installed Filebeat, these files might be in another location.
If you used the deb or rpm package, you will find that file $(which filebeat) reports that it is a script. You can find some of the locations used on your system in this script file. You should be able to find the module definitions in /usr/share/filebeat.
You can configure a custom directory for module definitions in filebeat.yml.
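For the deb/rpm case, the whole round trip looks roughly like this (paths assume the /usr/share/filebeat location mentioned above; adjust if your install differs):

# edit the haproxy pipeline definition
sudo vi /usr/share/filebeat/module/haproxy/log/ingest/default.json

# re-upload the updated ingest pipeline to Elasticsearch
sudo filebeat setup --pipelines --modules haproxy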