Filebeat 'dissect' won't override 'message' field

Hi there!

I'm having a strange problem when using a dissect processor in Filebeat.

Here is my code:

  - dissect:
      when:
        regexp:
          message: '^.*\s#remotelogmessage#\shost_ip:.*$'
      tokenizer: "%{[message]} #remotelogmessage# host_ip:%{[host][ip]}"
      field: "message"
      overwrite_keys: true
      target_prefix: ""

For some reason, even though I specify overwrite_keys: true, the message field is not being overwritten.

With the above configuration, I get this in the message field:
[INFO][pulp.agent.b3421482-750d-4125-8226-d5e52ea060ca] gofer.messaging.adapter.connect:30 - connected: proton+amqps://satellite6.sti.usherbrooke.ca:5647 #remotelogmessage#

I was able to confirm this by using another key name in the tokenizer. When I replace message with grostata in the tokenizer, like this:

  - dissect:
      when:
        regexp:
          message: '^.*\s#remotelogmessage#\shost_ip:.*$'
      tokenizer: "%{[grostata]} #remotelogmessage# host_ip:%{[host][ip]}"
      field: "message"
      overwrite_keys: true
      target_prefix: ""

The field grostata will contain exactly what I want:
[INFO][pulp.agent.b3421482-750d-4125-8226-d5e52ea060ca] gofer.messaging.adapter.connect:30 - connected: proton+amqps://satellite6.sti.usherbrooke.ca:5647

...that is, the line with the special tag I use for other purposes (#remotelogmessage#) removed.

So, this proves that my regex is working fine. At first, I thought it was wrong, but I later noticed that the key wasn’t being overwritten.

As soon as I put message back in place of grostata, the #remotelogmessage# tag is present in the message field again.

Does anyone have an idea about this issue?

Thank you all and Best Regards,
Yanick

Dear all,

I found something interesting, but I don't have any explanation for the behaviour.

I was able to get the message field overwritten. But to explain this correctly, I need to share more of my filebeat.yml configuration.

Here is a section for a specific filestream:

######################
# STI LOGS - MESSAGES
######################
- type: filestream

  # Change to true to enable this input configuration.
  enabled: true

  id: "sti-messages"

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/remote/*-messages.log

  tags: ["STI","system","remotelogmessage"]

  index: 'filebeat-sti-{{ beat_index_version }}-sys-linux'

  pipeline: filebeat-{{ beat_index_version }}-system-syslog-pipeline

  processors:
  - dissect:
      when:
        regexp:
          message: '^.*\s#remotelogmessage#\shost_ip:.*$'
      tokenizer: "%{[message]} #remotelogmessage# host_ip:%{[host][ip]}"
      field: "message"
      overwrite_keys: true
      target_prefix: ""

######################
# ANOTHER FILESTREAM SECTION STARTS BELOW
######################

The processor defined here should only be applied to this filestream section.

However, later in the configuration file, I have a "global" section that should apply to all the filestreams I've defined above. Here is that section:

#####################
# Common parameters
#####################
tags: ["rsyslog"]

processors:
  - add_locale: ~

  - add_fields:
      target: ''
      fields:
        log_source: rsyslog
        client_name: sti  

  # This dissect removes the srcLogFile= field that was added by the rsyslog configuration on the client.
  # To avoid problems with the ingest pipeline, this field must be removed before the line reaches Logstash.
  - dissect:
      when:
        regexp:
          message: '^.*\ssrcLogFile=.*$'
      tokenizer: "%{} srcLogfile=%{srcLogfile}"  

  - dissect:
      when:
        regexp:
          message: '^.*\ssrcLogFile=.*$'
      tokenizer: "%{[message]} srcLogFile=%{[host][log][file][path]}"
      overwrite_keys: true
      target_prefix: ""

  - dissect:
      when:
        regexp:
          message: '^.*\shost_ip:.*$'
      tokenizer: "%{[message]} host_ip:%{[host][ip]}"
      field: "message"
      overwrite_keys: true
      target_prefix: ""

If I comment out the last dissect:

 # - dissect:
 #    when:
 #     regexp:
 #      message: '^.*\shost_ip:.*$'
 #    tokenizer: "%{[message]} host_ip:%{[host][ip]}"
 #    field: "message"
 #    overwrite_keys: true
 #    target_prefix: ""

then the message field is properly overwritten and the #remotelogmessage# tag is removed.

Are those two dissect filters overwriting each other? Is there any specific order in which the processors are executed? Is the first dissect executing normally, but the change to the message field has not yet been committed, so the second dissect processes the original value?
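
To make that last question concrete, here is a small Python sketch (not Filebeat code) of what I would expect if the processors run strictly in sequence, each one seeing the event as the previous one left it. It assumes that input-level processors run before the global ones, which is my assumption and not something confirmed here. The `dissect`, `set_field` and `run_dissect` helpers are hypothetical simplifications: they turn a tokenizer into a regular expression and ignore dissect modifiers and the overwrite_keys check.

```python
import re

def dissect(text, tokenizer):
    """Hypothetical, simplified dissect: return {key: value}, or None on no match."""
    keys = re.findall(r"%\{([^}]*)\}", tokenizer)
    literals = re.split(r"%\{[^}]*\}", tokenizer)
    # Every %{key} becomes a non-greedy capture between the escaped literals.
    pattern = "(.*?)".join(re.escape(lit) for lit in literals)
    match = re.fullmatch(pattern, text, flags=re.DOTALL)
    return dict(zip(keys, match.groups())) if match else None

def set_field(event, key, value):
    """Write value at a key such as "[host][ip]" or "[message]"."""
    path = re.findall(r"\[([^\]]+)\]", key) or [key]
    node = event
    for name in path[:-1]:
        node = node.setdefault(name, {})
    node[path[-1]] = value

def run_dissect(event, when, tokenizer):
    """Apply one dissect step in place; return True if it was applied."""
    if not re.search(when, event["message"]):
        return False
    fields = dissect(event["message"], tokenizer)
    if fields is None:
        return False
    for key, value in fields.items():
        if key:  # "%{}" is a skip key and stores nothing
            set_field(event, key, value)
    return True

event = {"message": (
    "Feb 24 18:04:18 kibana-prod01.domain.com kibana[1664090]: at "
    "TaskManagerRunner.run (/usr/share/kibana/node_modules/@kbn/"
    "task-manager-plugin/server/task_running/task_runner.js:309:22) "
    "#remotelogmessage# host_ip:10.132.204.99"
)}

steps = [
    # Input-level dissect from the filestream section.
    (r"^.*\s#remotelogmessage#\shost_ip:.*$",
     "%{[message]} #remotelogmessage# host_ip:%{[host][ip]}"),
    # Last global dissect.
    (r"^.*\shost_ip:.*$",
     "%{[message]} host_ip:%{[host][ip]}"),
]

applied = [run_dissect(event, when, tokenizer) for when, tokenizer in steps]
print(applied)              # [True, False]
print(event["message"])     # the line without the tag and without host_ip
print(event["host"]["ip"])  # 10.132.204.99
```

Under these assumptions the global step is skipped, because its condition no longer matches once the first step has rewritten message, and the tag is gone. That is not the behaviour I see when both dissects are enabled.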

The original input line looks like this:
Feb 24 18:04:18 kibana-prod01.domain.com kibana[1664090]: at TaskManagerRunner.run (/usr/share/kibana/node_modules/@kbn/task-manager-plugin/server/task_running/task_runner.js:309:22) #remotelogmessage# host_ip:10.132.204.99

In that case (last global dissect commented out), the #remotelogmessage# tag is stripped, and host_ip is mapped to the [host][ip] field and also stripped from the message field.

But if I re-enable the commented-out dissect section above, the #remotelogmessage# tag is not stripped from the message field.
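
As a hand check in plain Python (not Filebeat), each of the two host_ip tokenizers amounts to splitting the line once on its literal delimiter. Assuming that simplification holds, the sketch below applies both to the original input line; the variable names are only for illustration.

```python
line = (
    "Feb 24 18:04:18 kibana-prod01.domain.com kibana[1664090]: at "
    "TaskManagerRunner.run (/usr/share/kibana/node_modules/@kbn/"
    "task-manager-plugin/server/task_running/task_runner.js:309:22) "
    "#remotelogmessage# host_ip:10.132.204.99"
)

# Input-level tokenizer: "%{[message]} #remotelogmessage# host_ip:%{[host][ip]}"
input_message, _, input_ip = line.partition(" #remotelogmessage# host_ip:")

# Last global tokenizer: "%{[message]} host_ip:%{[host][ip]}"
global_message, _, global_ip = line.partition(" host_ip:")

print(input_message.endswith("#remotelogmessage#"))   # False
print(global_message.endswith("#remotelogmessage#"))  # True
print(input_ip == global_ip == "10.132.204.99")       # True
```

The message produced by the global tokenizer alone still ends in #remotelogmessage#, which has the same shape as the message field shown at the top of this thread. So one possibility worth checking is that, when both are enabled, only the global dissect actually takes effect on these events.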

I understand it may be hard to figure out what's wrong here, but if you have an idea to share, please do so!

Regards,
Yanick