Haproxy format changes when sending directly

Solved

I was able to fix it by adding a custom pattern, if you are having the same issue follow these steps -

Create a directory (if it doesn't exists) /etc/logstash/conf.d/patterns

Create file test.txt in /etc/logstash/conf.d/patterns

Open the test.txt file and paste this pattern -

## These patterns were tested w/ haproxy-1.4.15

## Documentation of the haproxy log formats can be found at the following links:
## http://code.google.com/p/haproxy-docs/wiki/HTTPLogFormat
## http://code.google.com/p/haproxy-docs/wiki/TCPLogFormat

HAPROXYTIME (?!<[0-9])%{HOUR:haproxy_hour}:%{MINUTE:haproxy_minute}(?::%{SECOND:haproxy_second})(?![0-9])
HAPROXYDATE %{MONTHDAY:haproxy_monthday}/%{MONTH:haproxy_month}/%{YEAR:haproxy_year}:%{HAPROXYTIME:haproxy_time}.%{INT:haproxy_milliseconds}

# Override these default patterns to parse out what is captured in your haproxy.cfg
HAPROXYCAPTUREDREQUESTHEADERS %{DATA:captured_request_headers}
HAPROXYCAPTUREDRESPONSEHEADERS %{DATA:captured_response_headers}

# Example:
#  These haproxy config lines will add data to the logs that are captured
#  by the patterns below. Place them in your custom patterns directory to
#  override the defaults.
#
#  capture request header Host len 40
#  capture request header X-Forwarded-For len 50
#  capture request header Accept-Language len 50
#  capture request header Referer len 200
#  capture request header User-Agent len 200
#
#  capture response header Content-Type len 30
#  capture response header Content-Encoding len 10
#  capture response header Cache-Control len 200
#  capture response header Last-Modified len 200
#
# HAPROXYCAPTUREDREQUESTHEADERS %{DATA:request_header_host}\|%{DATA:request_header_x_forwarded_for}\|%{DATA:request_header_accept_language}\|%{DATA:request_header_referer}\|%{DATA:request_header_user_agent}
# HAPROXYCAPTUREDRESPONSEHEADERS %{DATA:response_header_content_type}\|%{DATA:response_header_content_encoding}\|%{DATA:response_header_cache_control}\|%{DATA:response_header_last_modified}

# parse a haproxy 'httplog' line
HAPROXYHTTPBASE %{IP:client_ip}:%{INT:client_port} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{INT:time_request}/%{INT:time_queue}/%{INT:time_backend_connect}/%{INT:time_backend_response}/%{NOTSPACE:time_duration} %{INT:http_status_code} %{NOTSPACE:bytes_read} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{INT:actconn}/%{INT:feconn}/%{INT:beconn}/%{INT:srvconn}/%{NOTSPACE:retries} %{INT:srv_queue}/%{INT:backend_queue} (\{%{HAPROXYCAPTUREDREQUESTHEADERS}\})?( )?(\{%{HAPROXYCAPTUREDRESPONSEHEADERS}\})?( )?"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?"?

HAPROXYHTTP (?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:timestamp8601})  %{SYSLOGPROG}: %{HAPROXYHTTPBASE}

# parse a haproxy 'tcplog' line
HAPROXYTCP (?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:timestamp8601}) %{IPORHOST:syslog_server} %{SYSLOGPROG}: %{IP:client_ip}:%{INT:client_port} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{INT:time_queue}/%{INT:time_backend_connect}/%{NOTSPACE:time_duration} %{NOTSPACE:bytes_read} %{NOTSPACE:termination_state} %{INT:actconn}/%{INT:feconn}/%{INT:beconn}/%{INT:srvconn}/%{NOTSPACE:retries} %{INT:srv_queue}/%{INT:backend_queue}

It's the default file just with '%{IPORHOST:syslog_server}' removed from HAPROXYHTTP pattern.

In file /etc/logstash/conf.d/logstash.conf add this filter -
filter {
grok {
patterns_dir => "/etc/logstash/conf.d/patterns"
match => ["message", "%{SYSLOG5424PRI}%{HAPROXYHTTP}"]
}
}

P.s For this to work your haproxy.cfg must have these additional lines -

  capture request header Host len 40
  capture request header X-Forwarded-For len 50
  capture request header Accept-Language len 50
  capture request header Referer len 200
  capture request header User-Agent len 200

  capture response header Content-Type len 30
  capture response header Content-Encoding len 10
  capture response header Cache-Control len 200
  capture response header Last-Modified len 200