Field remapping for backward compatibility

davetayl · March 21, 2025, 1:11am

I have an older environment (ES 7.7.3, LS 6.8.23 and FB 6.8.23), all working correctly with the older OS versions the client is using. As part of a new build we need to upgrade FB to 8.17 if possible; however, there are a lot of differences in the fields it sends, which is causing issues. Is there a way to remap the fields sent by FB 8.17 to conform to the older standard?

leandrojmp · March 21, 2025, 2:07am

Can you give some examples of what you want to do and what these differences are?

Keep in mind that Beats 8.17 is not compatible with either Logstash 6.8.23 or Elasticsearch 7.7.3; it may work in some cases, but it may also have issues.

davetayl · March 21, 2025, 3:40am

Yeah, I'm aware of the incompatibility; however, the company seems to feel that it "should just be a matter of remapping some fields".

The answer to your first question is that we are using it for log shipping for system logs.

What I'm seeing are error logs from logstash like the following.

[2025-03-20T21:05:25,584][DEBUG][o.e.a.b.TransportShardBulkAction] [<redacted>] [logstash-2025.03.20-000934][1] failed to execute bulk item (index) index {[logstash][doc][<redacted>], source[{"@version":"1","ecs":{"version":"1.12.0"},"@timestamp":"2025-03-21T00:49:40.896Z","service":{"type":"system"},"log":{"file":{"path":"/var/log/syslog"},"offset":88292325},"agent":{"ephemeral_id":"5f49041f-3707-4d99-b21f-3502af34d7e7","name":"<redacted>","id":"<redacted>","version":"8.17.3","type":"filebeat"},"aws":{"tags":{"roles":"<redacted>","stack":"<redacted>","cluster":"<redacted>","environment":"<redacted>","product":"<redacted>","Name":"<redacted>"}},"cloud":{"machine":{"type":"t3.xlarge"},"service":{"name":"EC2"},"image":{"id":"<redacted>"},"provider":"aws","region":"<redacted>","account":{"id":"<redacted>"},"instance":{"id":"<redacted>"},"availability_zone":"<redacted>"},"fileset":{"name":"syslog"},"event":{"timezone":"+00:00","dataset":"system.syslog","module":"system"},"input":{"type":"log"},"message":"Mar 21 00:49:40 ip-172-18-15-146 consul[353]: 2025-03-21T00:49:40.275Z [WARN]  agent: Check is now critical: check=service:elasticsearch-exporter:2"}]} org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [service] of type [keyword] in document with id 'IzI7tpUBV9sC8fiYxkVP' at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:303) ~[elasticsearch-6.8.10.jar:6.8.10]

So the log is making it through Logstash to Elasticsearch but can't be mapped to the index, as far as I can determine. The actual log message and all the info can be seen in the Elasticsearch error.

leandrojmp · March 21, 2025, 4:12am

Do you have any non-DEBUG logs for this issue as well, such as WARN or ERROR? DEBUG logs can sometimes confuse troubleshooting.

Looking at the log, it seems that you have mapping conflicts.

org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [service] of type [keyword] in document with id 'IzI7tpUBV9sC8fiYxkVP' at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:303) ~[elasticsearch-6.8.10.jar:6.8.10]

But in your document the service field is a JSON object.

  "service": {
    "type": "system"
  }

It looks like the service field is mapped as a keyword in your index, but you are trying to send a JSON object. This does not work on any version of the stack.
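The conflict can be pictured with a small Python sketch. This is a toy model only, not Elasticsearch's actual mapper logic: the function name find_conflicts and the old_mapping dictionary are hypothetical, and the mapping is assumed from the error message rather than read from the real index.

```python
# Toy model of the failing check; NOT Elasticsearch's real mapper code.
# `find_conflicts` and `old_mapping` are hypothetical names for illustration.

def find_conflicts(doc, mapping):
    """Return top-level fields whose value shape contradicts the mapped type."""
    conflicts = []
    for field, mapped_type in mapping.items():
        value = doc.get(field)
        if value is None:
            continue
        # A scalar type such as keyword cannot hold a JSON object,
        # and an object mapping cannot hold a scalar.
        is_object = isinstance(value, dict)
        if is_object != (mapped_type == "object"):
            conflicts.append(field)
    return conflicts


# Assumed old-style mapping: `service` was indexed as a plain string.
old_mapping = {"service": "keyword", "message": "text"}

# Trimmed copy of the Filebeat 8 event from the error log.
new_event = {"service": {"type": "system"}, "message": "Mar 21 00:49:40 ..."}

print(find_conflicts(new_event, old_mapping))  # ['service']
```

Any field reported here would have to be removed or renamed before the document reaches the index, or the index mapping would have to change.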

What is adding this field? Can you share your filebeat.yml and your logstash pipeline?

davetayl · March 21, 2025, 4:24am

Those are the only logs we are seeing, and yes, that was my understanding too. The question is how and where to do the remapping.

leandrojmp · March 21, 2025, 4:55am

You need to share both your filebeat.yml and your logstash configuration.

I don't think this is added by default by either Filebeat or Logstash.

davetayl · March 21, 2025, 5:01am

I'm basically using the defaults for testing; there isn't anything special in there at all.

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input-specific configurations.

# filestream is an input for collecting log messages from files.
- type: filestream

  # Unique ID among all inputs, an ID is required.
  id: <redacted>

  # Change to true to enable this input configuration.
  enabled: false

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/syslog
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1
# ------------------------------ 6.5.23 compatibility --------------------------
    # Disable ECS compatibility features
  fields_under_root: true
    
    # Add custom fields in non-ECS format
  fields:
    type: "syslog"  # Instead of event.dataset
    host: "${HOSTNAME}"  # Instead of host.name

# Disable ECS reformatting where possible
setup.ilm.enabled: false
setup.template.enabled: false

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

#setup.template.settings:
  #index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false


# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboard archive. By default, this URL
# has a value that is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
#setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# =============================== Elastic Cloud ================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.
#output:
# ---------------------------- Elasticsearch Output ----------------------------
#elasticsearch:
  # Array of hosts to connect to.
  # hosts: ["localhost:9200"]

  # Performance preset - one of "balanced", "throughput", "scale",
  # "latency", or "custom".
  # preset: balanced

  # Protocol - either `http` (default) or `https`.
  # protocol: "https"

  # Authentication credentials - either API key or username/password.
  # api_key: "id:api_key"
  # username: "elastic"
  # password: "changeme"

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["<redacted>:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

# ------------------------------ File Output -----------------------------------
#output.file:
#    path: "/tmp/filebeat"
#    filename: filebeat 
    
# ================================= Processors =================================
#processors:
  #- add_host_metadata:
  # when.not.contains.tags: forwarded
  #- add_cloud_metadata: ~
  #- add_docker_metadata: ~
  #- add_kubernetes_metadata: ~
  #  - drop_fields:
  #      fields:
  #        - "service"
  #        - "fileset"
  #        - "event"
  #        - "agent"
  #        - "ecs"
  #      ignore_missing: false

# ================================== Logging ===================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors, use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]

# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch outputs are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

# ============================== Instrumentation ===============================

# Instrumentation support for the filebeat.
#instrumentation:
    # Set to true to enable instrumentation of filebeat.
    #enabled: false

    # Environment in which filebeat is running on (eg: staging, production, etc.)
    #environment: ""

    # APM Server hosts to report instrumentation results to.
    #hosts:
    #  - http://localhost:8200

    # API Key for the APM Server(s).
    # If api_key is set then secret_token will be ignored.
    #api_key:

    # Secret token for the APM Server(s).
    #secret_token:


# ================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true

root@ip-172-18-13-174:/etc/logstash/conf.d# cat 100-input-beats.conf 
input {
  beats {
    include_codec_tag => false
    port => 5044
  }
}

leandrojmp · March 21, 2025, 5:08am

You need to share the entire logstash configuration, not just the input.

Also, are you sure that this is the filebeat configuration that is running? It does not match what the event in your log is showing.

In your configuration you have one input of the type filestream, but the event in the log you shared has an input of the type log.

  "input": {
    "type": "log"
  }

Also, the input has enabled: false, so this filebeat.yml would not collect anything.

davetayl · March 21, 2025, 5:12am

---
path.data: "/var/lib/logstash"
path.config: "/etc/logstash/conf.d"
path.logs: "/var/log/logstash"
dead_letter_queue.enable: true
log.format: json
log.level: warn
node.name: <redacted>
pipeline.workers: 12
pipeline.batch.size: 250
pipeline.batch.delay: 5000

davetayl · March 21, 2025, 5:15am

That is the default in the config when it is installed.

leandrojmp · March 21, 2025, 5:17am

This is the logstash.yml file. What you need to share are the pipeline configuration files, which according to your path.config are the files inside /etc/logstash/conf.d.

You shared an input file named 100-input-beats.conf; you need to share all the other files in the same path so that it is possible to understand the changes Logstash is making.

Also, the Filebeat configuration you shared is not the one that Filebeat is using: its input differs from the one in the log you shared, and it is not enabled. Can you validate and share the configuration Filebeat is actually running with?

davetayl · March 21, 2025, 5:38am

Actually it is the same. The filebeat input hasn't changed.

davetayl · March 21, 2025, 5:43am

Here are all the logstash configs

root@ip-172-18-12-64:/etc/logstash/conf.d# ls
100-input-beats.conf              100-input-file.conf          300-filter-beats_metadata.conf
100-input-dead_letter_queue.conf  299-filter-dead_letter.conf  500-output-sqs.conf
root@ip-172-18-12-64:/etc/logstash/conf.d# cat 100-input-beats.conf 
input {
  beats {
    include_codec_tag => false
    port => 5044
  }
}
root@ip-172-18-12-64:/etc/logstash/conf.d# cat 100-input-dead_letter_queue.conf 
input {
  dead_letter_queue {
    path => "/var/lib/logstash/dead_letter_queue"
    tags => ["_dead_letter"]
  }
}
root@ip-172-18-12-64:/etc/logstash/conf.d# cat 100-input-file.conf 
input {
  file {
    add_field => {
      "application" => "logstash"
      "role" => "logstash_app"
      "profile" => "shipper_beats"
    }
    path => "/var/log/logstash/logstash-json.log"
    type => "json"
  }
}
root@ip-172-18-12-64:/etc/logstash/conf.d# cat 299-filter-dead_letter.conf 
filter {
  if "_dead_letter" in [tags] {
    ruby {
      code => "event.set('original', event.to_json())"
    }

    prune {
      whitelist_names => [ "original", "type", "tags", "@timestamp", "reason" ]

      add_field => {
        "reason" => "%{[@metadata][dead_letter_queue][reason]}"
      }
    }
  }
}
root@ip-172-18-12-64:/etc/logstash/conf.d# cat 300-filter-beats_metadata.conf 
filter {
  # The `beats` input copies the `beat.hostname` field to the `host` field,
  # //unless// the `host` field exists in the input event. This is generally
  # undesirable because it results in data duplication, but it can also cause
  # mapping conflicts with Beats 6.3.0 (see
  # https://github.com/elastic/beats/issues/7050 and
  # https://github.com/elastic/beats/pull/7051).
  mutate {
    remove_field => ["host"]
  }
}
root@ip-172-18-12-64:/etc/logstash/conf.d# cat 500-output-sqs.conf 
output {
  sqs {
    batch_events => 250
    multiplex => "true"
    queue => "<redacted>"
    region => "<redacted>"
  }
}

leandrojmp · March 21, 2025, 5:55am

Yeah, I think I understand now.

From what I was able to troubleshoot from the log you shared, the message seems to come from a Filebeat using the system module. Can you confirm this? Check /etc/filebeat/modules.d/ (I think that is the path) and see if the system module is enabled.

This is your log, and it seems consistent with what would be collected by the syslog dataset of the system module.

{
  "@version": "1",
  "ecs": {
    "version": "1.12.0"
  },
  "@timestamp": "2025-03-21T00:49:40.896Z",
  "service": {
    "type": "system"
  },
  "log": {
    "file": {
      "path": "/var/log/syslog"
    },
    "offset": 88292325
  },
  "agent": {
    "ephemeral_id": "5f49041f-3707-4d99-b21f-3502af34d7e7",
    "name": "<redacted>",
    "id": "<redacted>",
    "version": "8.17.3",
    "type": "filebeat"
  },
  "aws": {
    "tags": {
      "roles": "<redacted>",
      "stack": "<redacted>",
      "cluster": "<redacted>",
      "environment": "<redacted>",
      "product": "<redacted>",
      "Name": "<redacted>"
    }
  },
  "cloud": {
    "machine": {
      "type": "t3.xlarge"
    },
    "service": {
      "name": "EC2"
    },
    "image": {
      "id": "<redacted>"
    },
    "provider": "aws",
    "region": "<redacted>",
    "account": {
      "id": "<redacted>"
    },
    "instance": {
      "id": "<redacted>"
    },
    "availability_zone": "<redacted>"
  },
  "fileset": {
    "name": "syslog"
  },
  "event": {
    "timezone": "+00:00",
    "dataset": "system.syslog",
    "module": "system"
  },
  "input": {
    "type": "log"
  },
  "message": "Mar 21 00:49:40 ip-172-18-15-146 consul[353]: 2025-03-21T00:49:40.275Z [WARN]  agent: Check is now critical: check=service:elasticsearch-exporter:2"
}
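The fields in this event that point at the system module can be read off with a short Python sketch. The helper name describe_origin is hypothetical, and the JSON is a trimmed copy of the event above that keeps only the fields of interest.

```python
import json


def describe_origin(event):
    """Collect the fields that hint at which Filebeat input or module produced an event."""

    def dig(*keys):
        # Walk nested dictionaries, returning None as soon as a level is missing.
        node = event
        for key in keys:
            if not isinstance(node, dict):
                return None
            node = node.get(key)
        return node

    return {
        "input_type": dig("input", "type"),
        "module": dig("event", "module"),
        "dataset": dig("event", "dataset"),
        "fileset": dig("fileset", "name"),
        "agent_version": dig("agent", "version"),
    }


# Trimmed copy of the event above; only the fields of interest are kept.
raw = """{
  "agent": {"version": "8.17.3", "type": "filebeat"},
  "fileset": {"name": "syslog"},
  "event": {"dataset": "system.syslog", "module": "system"},
  "input": {"type": "log"}
}"""

origin = describe_origin(json.loads(raw))
for key, value in origin.items():
    print(f"{key}: {value}")
```

An input type of log together with the module system and the fileset syslog is what suggests the system module, rather than the filestream input in the posted filebeat.yml.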

The main issue here is that there are a lot of changes between versions 6 and 8. Filebeat 8 adds some fields using ECS, and those fields can conflict if you were already using some of them, which seems to be the case for the service field.

So you would need to remove those fields in Logstash using a mutate filter.

Something like this:

filter {
    mutate {
        remove_field => ["service"]
    }
}

davetayl · March 24, 2025, 12:43am

OK, great, thank you. I'll investigate that and see how I go.

davetayl · March 24, 2025, 4:43am

Thank you for the assistance.

Here is the final config that worked for me with Filebeat 8.17.5 sending logs to Logstash 6.5.23 and Elasticsearch 7.7.1:

/etc/logstash/conf.d/100.input.beats.conf

input {
  beats {
    include_codec_tag => false
    port => 5044
  }
}

filter {
  mutate {
    remove_field => "[service]"
    remove_field => "[ecs]"
    remove_field => "[agent]"
    remove_field => "[event]"
    remove_field => "[host]"
    rename => { "[log][offset]" => "offset" }
    rename => { "agent" => "beat" }
    rename => { "[beat][name]" => "[beat][hostname]" }
    rename => { "[beat][type]" => "[beat][name]" }
  }
}
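To see what shape of event this filter produces, here is a rough Python simulation. It is a sketch under assumptions, not Logstash itself: it assumes mutate applies every rename (in the order written) before remove_field runs, which is my reading of the mutate filter documentation. The helpers parse_ref, pop_field, set_field and apply_mutate are hypothetical, and the agent name web-01 is a placeholder for the redacted value.

```python
import copy
import re


def parse_ref(ref):
    """Split a field reference like "[log][offset]" or "agent" into a list of keys."""
    return re.findall(r"\[([^\]]+)\]", ref) or [ref]


def pop_field(event, ref):
    """Remove and return the value at ref, or None if it is absent."""
    *parents, leaf = parse_ref(ref)
    node = event
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return None
    return node.pop(leaf, None)


def set_field(event, ref, value):
    """Set the value at ref, creating intermediate objects as needed."""
    *parents, leaf = parse_ref(ref)
    node = event
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def apply_mutate(event, renames, removals):
    """Apply renames in order, then removals, to a copy of the event."""
    out = copy.deepcopy(event)
    for src, dst in renames:
        value = pop_field(out, src)
        if value is not None:  # a missing source field is skipped
            set_field(out, dst, value)
    for ref in removals:
        pop_field(out, ref)
    return out


# Same renames and removals as the filter above, in the same order.
renames = [
    ("[log][offset]", "offset"),
    ("agent", "beat"),
    ("[beat][name]", "[beat][hostname]"),
    ("[beat][type]", "[beat][name]"),
]
removals = ["[service]", "[ecs]", "[agent]", "[event]", "[host]"]

# Trimmed event; "web-01" is a placeholder for the redacted agent name.
event = {
    "service": {"type": "system"},
    "ecs": {"version": "1.12.0"},
    "agent": {"name": "web-01", "type": "filebeat", "version": "8.17.3"},
    "event": {"module": "system", "dataset": "system.syslog"},
    "log": {"file": {"path": "/var/log/syslog"}, "offset": 88292325},
    "message": "example line",
}

result = apply_mutate(event, renames, removals)
print(sorted(result))   # ['beat', 'log', 'message', 'offset']
print(result["beat"])   # {'version': '8.17.3', 'hostname': 'web-01', 'name': 'filebeat'}
```

Under that ordering assumption, removing [agent] has no effect because the field has already been renamed to beat, and the object-valued service, ecs and event fields never reach the old index mapping.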
