A little guidance, please

Hello, and first of all I would like to say that your product looks awesome and is pretty noob-friendly (bearing in mind that I managed to more or less set it up in one day).

I am mainly looking to filter HAProxy logs, and it would be amazing if someone could give a little guidance on what is needed to achieve that, because every tutorial I have followed is different and not exactly what I'm looking for.

I have a Debian server running HAProxy, with Filebeat 6.6.1 installed and the haproxy module enabled.
I have signed up for Elastic Cloud and added all the needed information to the configs, so I can see the logs coming in on my Elastic Cloud dashboard.

The 'messages' I am getting are the exact rows from the log file, e.g.:
Mar 9 11:10:38 proksikas haproxy[23079]: proksikas CONNECT i.instagram.com:443 i.instagram.com:443 200 0 54 1552147836 1712 558 192.168.1.2 45964 43248 34.259.125.134 CONNECT i.instagram.com:443 HTTP/1.1 7248 192.168.1.101

What I am looking for is to index each of the columns as a separate field; from what I understand, Logstash could possibly do that with regex?

Sorry about the confusion; I'm doing my best to read up on what the ELK Stack can do!

Note: I'm going to move this post to the Filebeat forum, because I think your question is really about that product rather than Elasticsearch.

If you are using the HAProxy module then you almost certainly already have separate fields, per the haproxy fields page in the Filebeat Reference.

I suspect that the issue is simply to do with the way you have been looking at your data so far.

Firstly, you should have a sample dashboard already as part of the beat setup.

If that's there and is showing graphs of response codes (etc) then that's a pretty good sign that everything is working correctly.
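If the dashboards are missing for some reason, re-running the dashboard setup step should load them into Kibana (standard Filebeat setup command; adjust to how you run Filebeat on your host):

filebeat setup --dashboards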

If you want to get face-to-face with your data, then try the Kibana Console. Jump over there and then perform a _search (docs) on the index you're using for your HAProxy data. It should show you 10 documents (the default limit), and all their fields.
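For example, something along these lines in the Console should return the first 10 matching documents with all of their fields (the filebeat-* index pattern and the fileset.module field are just the usual Filebeat 6.x defaults, so adjust them if your setup differs):

GET filebeat-*/_search
{
  "size": 10,
  "query": {
    "match": { "fileset.module": "haproxy" }
  }
}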

Thank you for the response. As strange as this might sound, after leaving it overnight I actually started seeing the structured data without making any additional changes (the time range was always set to the last 5 years, so I wasn't just missing the traffic).

Referring to the haproxy fields | Filebeat Reference [6.6] | Elastic page:
there should be a haproxy.destination.port field, which I'm mainly interested in, but it's not showing in the Discover tab.

Is there something additional I need to configure?
I am not using a custom log format.
Here's my haproxy configuration -

global

log /dev/log    local0
log /dev/log    local1 notice
log 127.0.0.1:9001 local0

maxconn 55550
maxcompcpuusage 100
maxcomprate 0
nbproc 1
ssl-server-verify none
#stats socket /var/run/haproxy.sock mode 777 level admin
stats socket /var/lib/haproxy/stats
daemon

defaults
retries 2
timeout connect 30s
timeout client 30s
timeout server 30s
timeout check 50000
timeout http-keep-alive 5000
timeout http-request 50000
option http-server-close

listen PRIEKIS
mode http
option httplog
option dontlognull
log global
bind *:1111-1112

/etc/filebeat/modules.d/haproxy.yml configuration -

- module: haproxy
  log:
    enabled: true
    var.paths: ["/var/log/haproxy.log"]
    var.input: "file"

/etc/filebeat/filebeat.yml configuration -

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: false

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
   # - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*
#     - /var/log/haproxy.log

filebeat modules list -

root@kobra:/home/kobra# filebeat modules list
Enabled:
elasticsearch
haproxy
kibana
system

Disabled:
apache2
auditd
icinga
iis
kafka
logstash
mongodb
mysql
nginx
osquery
postgresql
redis
suricata
traefik

Thank you ! :hugs:

Depending on the HAProxy version and log format, the destination.port field might or might not be available. The module only parses the port if it is present in the log line. Do you have some sample logs (please redact the IPs)?


There is indeed no such data in the default HTTP log format, and I just realized that I am actually not looking for destination.port but for the frontend port (the one HAProxy is listening on).
Is there a way to edit the module to work with a custom log format?
Thank you

Mar 11 10:04:54 proksikas haproxy[20727]: 73.189.138.13:61720 [11/Mar/2019:10:02:25.861] Bite Bite/ss142 0/0/0/44/149050 200 133092 - - cD-- 577/577/572/2/0 0/0 "CONNECT i.instagram.com:443 HTTP/1.1"

Sure. You will find the parser configuration in module/haproxy/log/ingest/default.json. You can modify it at will and install the updated version with filebeat setup --pipelines --modules haproxy afterwards.
If you look at the file you will find four predefined grok patterns (the regexes used to parse the line). You can just add your custom one to the list. Also see the grok processor docs.
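As a rough sketch only (the field names and the pattern below are examples, not the module's shipped patterns), the grok processor in default.json looks roughly like this; you would append your custom pattern to the existing patterns array rather than replacing it:

{
  "grok": {
    "field": "message",
    "patterns": [
      "%{SYSLOGTIMESTAMP} %{HOSTNAME} haproxy\\[%{NUMBER:haproxy.pid:int}\\]: %{IP:haproxy.client.ip}:%{NUMBER:haproxy.client.port:int} %{GREEDYDATA:haproxy.raw_request}"
    ],
    "ignore_missing": false
  }
}

You can test a candidate pattern against one of your sample lines with the ingest pipeline _simulate API before reinstalling the pipeline.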


Amazing, thank you!

Just one more thing: in my current Filebeat installation there's no such folder as 'module':

root@proksikas:/etc/filebeat# ls
fields.yml  filebeat.reference.yml  filebeat.yml  modules.d

I suppose I should get the source from GitHub? Sorry about these noob questions :)

Depending on how you installed Filebeat, they might be in another location.

If you have used the deb or rpm package, you will find that file $(which filebeat) reports that it is a script. You can find some of the locations used on your system in that script. You should be able to find the module definitions under /usr/share/filebeat.
You can also configure a custom directory for module definitions in filebeat.yml.
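For example, on a deb/rpm install something along these lines should locate the definitions (the paths are the usual package defaults and may differ on your system):

file $(which filebeat)                               # usually reports a wrapper shell script on deb/rpm installs
ls /usr/share/filebeat/module/haproxy/log/ingest/    # default.json with the grok patterns usually lives here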


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.