Filebeat sends malformed logs

Hi,

I'm using Filebeat to receive logs from a Juniper firewall and forward them to Logstash, which then sends them to Elasticsearch.

I configured my Juniper firewall to use the required log format (structured-data + brief) as mentioned here.

I can see the logs coming in with tcpdump, and they seem to be in the correct format:

root@elastic04:~# tcpdump dst port 514 -n -v
tcpdump: listening on eno0, link-type EN10MB (Ethernet), capture size 262144 bytes
16:33:33.502005 IP (tos 0x0, ttl 64, id 31032, offset 0, flags [none], proto UDP (17), length 544)
    x.x.x.x.514 > 10.11.200.40.514: SYSLOG, length: 516
	Facility user (1), Severity info (6)
	Msg: 1 2023-08-31T12:47:35.693-04:00 PNI-FW1 RT_FLOW - RT_FLOW_SESSION_DENY [junos@2636.1.1.1.2.26 source-address="89.117.77.147" source-port="44122" destination-address="x.x.x.x" destination-port="2080" connection-tag="0" service-name="None" protocol-id="6" icmp-type="0" policy-name="inbound_clean_up" source-zone-name="I2" destination-zone-name="XXX" application="UNKNOWN" nested-application="UNKNOWN" username="N/A" roles="N/A" packet-incoming-interface="et-1/1/0.0" encrypted="UNKNOWN" reason="policy deny"]
16:33:33.523229 IP (tos 0x0, ttl 64, id 31038, offset 0, flags [none], proto UDP (17), length 545)
    x.x.x.x.514 > 10.11.200.40.514: SYSLOG, length: 517
	Facility user (1), Severity info (6)
	Msg: 1 2023-08-31T12:47:35.715-04:00 PNI-FW1 RT_FLOW - RT_FLOW_SESSION_DENY [junos@2636.1.1.1.2.26 source-address="79.124.60.206" source-port="55541" destination-address="x.x.x.x" destination-port="48755" connection-tag="0" service-name="None" protocol-id="6" icmp-type="0" policy-name="inbound_clean_up" source-zone-name="I2" destination-zone-name="XXX" application="UNKNOWN" nested-application="UNKNOWN" username="N/A" roles="N/A" packet-incoming-interface="et-1/1/0.0" encrypted="UNKNOWN" reason="policy deny"]

However, when they make it to Elasticsearch they are malformed and look like this:

2023-08-31T16:38:57.287216946Z %{host} ]\xD5Z\x8B\xAC\xA9\x8BRp\xB4y\t\x87\xC3,\x9C\xFF_,\xDE|\xF7\xF7\t\xE7\xBF\xE0\xA7\u001D\xB5=N\u0015\u007F\xDCkb\x9E\xBEP\xFF\xBCʾ\x95\xB6g\xDFR\xDBE\xF9\x9A\xB6\xFB\xFE\xCFh{\u0012\xC7|\xB8F\x9B\xBFR\xE7\xBF.\xA2\xAF\xE8C|\x9C(\xEC/L\xAFIƃ>\u001C\xCF\xCCßL\xFFAr\xF1\x87\xF5!J\xCD\u007F\a\xE5+\a\xE5p\xF8e\u0000\xADw\u0012\x83

2023-08-31T16:39:30.055081833Z %{host} \x9EK\x9C^\xE8\x8B\xE2\tj\xD7\xF1\xA8\xDE\e\xDD\xCEAN]d\xCEԨUjt\u001A\u001A\xBA\xC1AL]\xDB=\xFF\xB0\xD4\xC8.\xCC\xED\xE3\x94~\xBD)\xF0\x91w\xE5F\u0013\xECCb\x85\xA2\xC8l\x87\xF4wۅ\x84:\x83`\xA6\u007FI\xA4*C

I enabled the juniper module on Filebeat, and this is what my Filebeat config looks like:

# ============================== Filebeat inputs ===============================

filebeat.inputs:



# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "192.168.39.30:5601"
  username: "elastic"  
  password: "password"
  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# ================================== Outputs ===================================

# ------------------------------ Logstash Output -------------------------------

output.logstash:
  # The Logstash hosts
  hosts: ["192.168.40.23:5145"]

and this is the juniper module config:

# Module: juniper
# Docs: https://www.elastic.co/guide/en/beats/filebeat/8.6/filebeat-module-juniper.html

- module: juniper
  junos:
    enabled: true

    # Set which input to use between udp (default), tcp or file.
    var.input: udp
    var.syslog_host: 10.11.200.40
    var.syslog_port: 514

    # Set paths for the log files when file input is used.
    # var.paths:

    # Toggle output of non-ECS fields (default true).
    # var.rsa_fields: true

    # Set custom timezone offset.
    # "local" (default) for system timezone.
    # "+02:00" for GMT+02:00
    # var.tz_offset: local

I'm also seeing some errors in the Filebeat logs:

root@elastic04:~# journalctl -u filebeat.service -xn | less
hed":7,"retry":334,"total":7}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.17,"15":0.18,"5":0.22,"norm":{"1":0.003,"15":0.0032,"5":0.0039}}}},"ecs.version":"1.6.0"}}
Aug 31 16:50:27 elastic04 filebeat[789860]: {"log.level":"error","@timestamp":"2023-08-31T16:50:27.748Z","log.logger":"logstash","log.origin":{"file.name":"logstash/async.go","file.line":280},"message":"Failed to publish events caused by: read tcp 192.168.40.1:39786->192.168.40.23:5145: i/o timeout","service.name":"filebeat","ecs.version":"1.6.0"}
Aug 31 16:50:27 elastic04 filebeat[789860]: {"log.level":"error","@timestamp":"2023-08-31T16:50:27.749Z","log.logger":"logstash","log.origin":{"file.name":"logstash/async.go","file.line":280},"message":"Failed to publish events caused by: read tcp 192.168.40.1:39786->192.168.40.23:5145: i/o timeout","service.name":"filebeat","ecs.version":"1.6.0"}
Aug 31 16:50:27 elastic04 filebeat[789860]: {"log.level":"error","@timestamp":"2023-08-31T16:50:27.749Z","log.logger":"logstash","log.origin":{"file.name":"logstash/async.go","file.line":280},"message":"Failed to publish events caused by: read tcp 192.168.40.1:39786->192.168.40.23:5145: i/o timeout","service.name":"filebeat","ecs.version":"1.6.0"}
Aug 31 16:50:27 elastic04 filebeat[789860]: {"log.level":"error","@timestamp":"2023-08-31T16:50:27.750Z","log.logger":"logstash","log.origin":{"file.name":"logstash/async.go","file.line":280},"message":"Failed to publish events caused by: client is not connected","service.name":"filebeat","ecs.version":"1.6.0"}
Aug 31 16:50:28 elastic04 filebeat[789860]: {"log.level":"error","@timestamp":"2023-08-31T16:50:28.871Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":176},"message":"failed to publish events: client is not connected","service.name":"filebeat","ecs.version":"1.6.0"}
Aug 31 16:50:28 elastic04 filebeat[789860]: {"log.level":"info","@timestamp":"2023-08-31T16:50:28.871Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":139},"message":"Connecting to backoff(async(tcp://192.168.40.23:5145))","service.name":"filebeat","ecs.version":"1.6.0"}
Aug 31 16:50:28 elastic04 filebeat[789860]: {"log.level":"info","@timestamp":"2023-08-31T16:50:28.871Z","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":147},"message":"Connection to backoff(async(tcp://192.168.40.23:5145)) established","service.name":"filebeat","ecs.version":"1.6.0"}

@MheniMerz Would you be able to provide the following as well:

  1. Logstash output config in filebeat.yml
  2. Logstash input config (the pipeline config on the Logstash side)
  3. Any Logstash filters/pipelines you might have, including the ES output.

Looking at the weird format, it's usually a sign that the traffic is encrypted, but you are using UDP and not TCP (so no TLS), which usually means there is a misconfiguration that mangles the data in transit (between Filebeat and ES).
Since your logs include a Logstash output, I presume there is a logstash in between? :slight_smile:

It might also be good to mention that, if you are open to using Elastic Agent and its Integrations, there are some brand-new Juniper ones; the one you are currently using is fairly old, I believe.


@Marius_Iversen Thanks for the reply.

My pipeline looks like the following:

Juniper_Firewalls -> filebeat[10.11.200.40] -> logstash01[192.168.40.23] -> kafka[192.168.40.211] -> logstash02[192.168.40.136] -> elasticsearch[192.168.37.209]

I didn't have Filebeat before, so the firewall was sending directly to Logstash, but the whole message ended up in one text field, which is why I added Filebeat to automatically parse the logs into the appropriate field types.

I haven't experimented with Elastic Agent yet. Can it replace Filebeat in this pipeline? If not, where does it fit in?

Here is what the filebeat.yml output section looks like (the whole file is included in the question above; just scroll to see the rest).

This is using TCP (I tried UDP, but I don't think the Filebeat output supports it).
The input to Filebeat is UDP on port 514 via the juniper module, but the output to Logstash is TCP.

# ------------------------------ Logstash Output -------------------------------

output.logstash:
  # The Logstash hosts
  hosts: ["192.168.40.23:5145"]

The logstash01 config looks like the following:

input {
....
  tcp {
    host => "192.168.40.23"
    port => 5145
    type => "syslog"
  }
}

filter {}

output {
....
  if [type] == "syslog" {
    kafka {
      bootstrap_servers => "192.168.40.211:9092"
      topic_id => "junos-fw"
    }
  }
}

And this is the logstash02 config.
I'm not using any Logstash filters for now, and I do have SSL on my Elasticsearch, as you can see in the output section:

input {
  kafka {
    bootstrap_servers => "192.168.40.211:9092"
    topics => ["junos-fw"]
  }
}

filter {

}

output {
  elasticsearch {
    hosts => ["https://192.168.37.209"]
    index => "junos-fw-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "password"
    ssl => true
    cacert => "/etc/logstash/http_ca.crt"
  }
}

Ah, then I can see what's wrong.
To solve your current problem, the input in the logstash01 config needs to be a Beats-specific input rather than a plain tcp input:
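For reference, a minimal sketch of what that could look like in the logstash01 pipeline, reusing port 5145 from your current config (the bind address and the type tag are assumptions; adjust as needed):

input {
  beats {
    host => "192.168.40.23"
    port => 5145
    # tagging the events should keep the existing [type] == "syslog" conditional in the kafka output matching
    type => "syslog"
  }
}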

This explains your weirdly formatted data: Filebeat's Logstash output speaks the Beats (Lumberjack) protocol rather than plain syslog over TCP, so a plain tcp input cannot decode it.

Sorry for dropping a lot of information on you; feel free to ignore the below, but I wanted to add it here just in case you are interested.

Just to add a few things: since you are not using Elastic Agent yet, the Filebeat module you have is an older, experimental version that was later replaced, and usually the parsing of the data does not happen in Filebeat but rather in Elasticsearch (though in this very specific case, the parsing does happen in Filebeat).

Usually with Filebeat modules, the parsing happens in Elasticsearch, which is why it's required to run filebeat setup once, with an Elasticsearch output configured, to install the ingest pipelines and other required assets.
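As a sketch, that one-time setup step could look something like the following (the Elasticsearch host and credentials are just copied from your logstash02 output and are assumptions; adjust them, plus any TLS settings, to your cluster):

filebeat setup --pipelines --modules juniper \
  -E output.logstash.enabled=false \
  -E 'output.elasticsearch.hosts=["https://192.168.37.209:9200"]' \
  -E output.elasticsearch.username=elastic \
  -E output.elasticsearch.password=password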

When running Elastic Agent, you don't have to run filebeat setup or anything else; Kibana handles the installation and upgrades of everything you need for the integrations (like Juniper).

There is, however, a way to test this. Usually, when you use Filebeat to send data to Logstash and then to ES, the Logstash output config needs to specify the Elasticsearch ingest pipeline that Filebeat installed during the setup phase so the data gets parsed; it's similar with Elastic Integrations.

If you go to the Kibana side menu, click Integrations, choose Juniper SRX (not the deprecated Juniper one), then click Settings and "Install assets", it will install the ingest pipelines used to parse your data; the pipeline name usually depends on the name and version of the integration.

Now you can either go back to sending the data to Logstash directly, or disable the Filebeat module and use a plain tcp/udp input (with no Filebeat modules enabled) and send it to Logstash.
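If you take the plain-input route, a minimal sketch of a raw UDP input in filebeat.yml (no module enabled; the bind address is taken from your current module config and is an assumption) might be:

filebeat.inputs:
  - type: udp
    # binding to a port below 1024 requires root or the appropriate capability
    host: "10.11.200.40:514"
    max_message_size: 10KiB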

Then, in the logstash02 output, you always need to configure which ingest pipeline to use; for this integration and version it is named "logs-juniper_srx.log-1.14.1". Ref: Elasticsearch output plugin | Logstash Reference [8.9] | Elastic
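A sketch based on your existing logstash02 output; the pipeline name is an assumption and must match the integration version you actually installed:

output {
  elasticsearch {
    hosts => ["https://192.168.37.209"]
    index => "junos-fw-%{+YYYY.MM.dd}"
    # ingest pipeline installed by the Juniper SRX integration assets
    pipeline => "logs-juniper_srx.log-1.14.1"
    user => "elastic"
    password => "password"
    ssl => true
    cacert => "/etc/logstash/http_ca.crt"
  }
}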

The rest of the output can be configured the same way you already do.
Since you have Logstash in between, you also need to add the field mappings to your index/data stream manually; the mappings were also installed together with the pipeline and are named logs-juniper_srx.log@package.
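For example, a sketch of wiring those mappings into an index template for your current index name (assumptions: the junos-fw-* pattern from your output, and that the installed mappings live in the logs-juniper_srx.log@package component template):

PUT _index_template/junos-fw
{
  "index_patterns": ["junos-fw-*"],
  "composed_of": ["logs-juniper_srx.log@package"],
  "priority": 200
}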


@Marius_Iversen
Thank you, that was very helpful.

The integrations method sounds like a cleaner way to do this; I will be exploring that route.
Do you know if it requires a license?

When I go to the Integrations tab in Kibana, it keeps loading and just gets stuck like that.

Pretty much all integrations are license-free (they are available under the Basic license, which is what comes by default).

It might be that Kibana does not have access to the internet? Integrations work similarly to Linux package managers (yum, apt, etc.): Kibana checks online for updates and for newly available integrations, of which there are currently around 200+.

If you end up going the integration route without Elastic Agent, just remember that the data going through Logstash and Kafka needs to stay in its original format when it enters Elasticsearch.

Looking at the files ending with .log (not the ones ending with expected.json), you can see what sort of formats we expect from SRX/JUNOS logs:


Ah thanks, that's probably it.
I believe egress and ingress HTTP and HTTPS traffic is allowed for the Kibana host, but I will double-check.
Unless it uses a different protocol to check for updates and download the integrations?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.