Filebeat / HAProxy: Grok in pipeline needs both request and response headers captured to parse them

lpoujol · March 27, 2023, 4:10pm

Hi

I've been testing Filebeat (8.6.2) to collect logs generated by HAProxy 2.2. The logs are sent directly to Elasticsearch and processed by the haproxy ingest pipeline that Filebeat sets up.

In my HAProxy setup, I only capture one request header (the HTTP Host). No response headers are captured.

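Sadly, it seems the Grok pattern in that pipeline only stores captured headers when they are captured in both the request and the response, because the two named fields sit in the same ()? group. A single {...} block is matched by the unnamed alternative but not stored in any field.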

The original Grok rule deployed by Filebeat:

(%{NOTSPACE:process.name}\[%{NUMBER:process.pid:long}\]: )?(%{IP:source.address}|-):%{NUMBER:source.port:long} \[%{NOTSPACE:haproxy.request_date}\] %{NOTSPACE:haproxy.frontend_name} %{NOTSPACE:haproxy.backend_name}/%{NOTSPACE:haproxy.server_name} (%{IPORHOST:destination.address} )?%{NUMBER:haproxy.http.request.time_wait_ms:long}/%{NUMBER:haproxy.total_waiting_time_ms:long}/%{NUMBER:haproxy.connection_wait_time_ms:long}/%{NUMBER:haproxy.http.request.time_wait_without_data_ms:long}/%{NUMBER:temp.duration:long} %{NUMBER:http.response.status_code:long} %{NUMBER:haproxy.bytes_read:long} %{NOTSPACE:haproxy.http.request.captured_cookie} %{NOTSPACE:haproxy.http.response.captured_cookie} %{NOTSPACE:haproxy.termination_state} %{NUMBER:haproxy.connections.active:long}/%{NUMBER:haproxy.connections.frontend:long}/%{NUMBER:haproxy.connections.backend:long}/%{NUMBER:haproxy.connections.server:long}/%{NUMBER:haproxy.connections.retries:long} %{NUMBER:haproxy.server_queue:long}/%{NUMBER:haproxy.backend_queue:long} (\{%{DATA:haproxy.http.request.captured_headers}\} \{%{DATA:haproxy.http.response.captured_headers}\} |\{%{DATA}\} )?"%{GREEDYDATA:haproxy.http.request.raw_request_line}"

Example HAProxy log line:

Mar 27 15:50:56 hostname haproxy[26830]: 192.0.2.42:50938 [27/Mar/2023:15:50:56.108] external my-backend/my-server 0/0/8/3/11 200 419 - - ---- 1/1/0/0/0 0/0 {example.net} "GET / HTTP/1.1"
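
To make the behaviour concrete, here is a minimal Python sketch of just the tail of that rule (the header group and the request line). It is my own hand translation, not anything shipped by Filebeat: it assumes Grok's DATA is `.*?` and GREEDYDATA is `.*`, the group names are shortened because Python group names cannot contain dots, and the `{nginx}` response block is made up for illustration.

```python
import re

# Tail of the Filebeat grok rule, hand-translated to a Python regex.
# Assumption: DATA -> .*?  and GREEDYDATA -> .*
# Group names are shortened stand-ins for the haproxy.http.* fields.
TAIL = re.compile(
    r'(\{(?P<request_headers>.*?)\} \{(?P<response_headers>.*?)\} |\{.*?\} )?'
    r'"(?P<raw_request_line>.*)"'
)

def parse_tail(line):
    """Return the named captures for the end of a log line, or None."""
    m = TAIL.search(line)
    return m.groupdict() if m else None

# Hypothetical line tail with both header blocks captured.
both = '0/0 {example.net} {nginx} "GET / HTTP/1.1"'
# Tail of the example line above: only the request header block.
single = '0/0 {example.net} "GET / HTTP/1.1"'

print(parse_tail(both))
print(parse_tail(single))
```

With two blocks, both header groups are filled. With a single block, the line still matches through the unnamed `\{.*?\} ` alternative, but both header groups come back as `None`, so the captured Host value is lost.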

Would there be a way to make this work when headers are captured only in the request or only in the response?

Maybe we could split that ()? group in two. But in that case, if you chose to capture only response headers, they would be labeled as request headers, since we seem to have no way of knowing whether a lone block holds request or response headers. So that is not a good option.

An alternative would be:

If headers are captured in both request and response, we can properly identify them and label them accordingly.
If headers are captured in request XOR response, we could label them in a neutral way, since we can't know whether they are request or response headers.
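
A rough Python sketch of that alternative, again only for the tail of the line and again a hand translation rather than the real pipeline. The neutral group name `captured_headers` is hypothetical, as are the shortened request and response group names.

```python
import re

# Same tail as the Filebeat rule, but the single-block alternative is
# named so its content is kept under a neutral label.
# Assumption: DATA -> .*?  and GREEDYDATA -> .*
NEUTRAL_TAIL = re.compile(
    r'(?:\{(?P<request_headers>.*?)\} \{(?P<response_headers>.*?)\} '
    r'|\{(?P<captured_headers>.*?)\} )?'
    r'"(?P<raw_request_line>.*)"'
)

def parse_headers(line):
    """Return only the groups that actually matched, or None on no match."""
    m = NEUTRAL_TAIL.search(line)
    if m is None:
        return None
    return {k: v for k, v in m.groupdict().items() if v is not None}

# Two blocks: labeled as request and response headers.
print(parse_headers('0/0 {example.net} {nginx} "GET / HTTP/1.1"'))
# One block: kept under the neutral label instead of being dropped.
print(parse_headers('0/0 {example.net} "GET / HTTP/1.1"'))
# No block: only the request line is returned.
print(parse_headers('0/0 "GET / HTTP/1.1"'))
```

In the Grok rule itself, the equivalent change would presumably be to give the bare `\{%{DATA}\} ` alternative a field name of its own; which name to use is an open question.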

I don't have a good solution for this. Maybe you have better ideas.

But it would be nice if headers captured only in the request or only in the response were parsed and not thrown away.

Good day

system · April 24, 2023, 6:10pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
