Hitting some sort of field type mismatch after upgrading filebeat to 7.x

Our setup looks like:

filebeat -> logstash -> elasticsearch

After upgrading some of our servers from filebeat 5.x to filebeat 7.x we seem to be hitting some sort of field type mismatch. We use daily indexes with the logstash-* pattern. When the daily index flips I start getting these errors:

[2020-03-03T17:09:36,070][DEBUG][o.e.a.b.TransportShardBulkAction] [ps-dev-elk] [logstash-2020.03.03][0] failed to execute bulk item (index) index {[logstash-2020.03.03][_doc][fuFfoXABQ1tpPHB50Pa-], source[{"name":"Other","referrer":"\"-\"","@version":"1","os":"Other","verb":"GET","message":"127.0.0.1 - - [03/Mar/2020:17:09:34 +0000] \"GET /server-status?auto HTTP/1.1\" 200 1488 \"-\" \"-\" \"-\"","clientip":"127.0.0.1","request":"/server-status?auto","os_name":"Other","agent":"\"-\"","type":"apache","timestamp":"03/Mar/2020:17:09:34 +0000","auth":"-","source":"/var/log/apache2/benefits.log","beat":{"version":"5.6.16","name":"ps-partner-dev-web01.domain.com","hostname":"ps-partner-dev-web01.domain.com"},"bytes":1488,"httpversion":"1.1","response":"200","input_type":"log","device":"Other","build":"","offset":4838529,"requestid":"\"-\"","ident":"-","@timestamp":"2020-03-03T17:09:34.000Z","tags":["beats_input_codec_plain_applied","internalIP"]}]}
org.elasticsearch.index.mapper.MapperParsingException: object mapping for [agent] tried to parse field [agent] as object, but found a concrete value

The agent field is coming from our Apache logs. The built-in HTTPD_COMBINEDLOG grok pattern sets agent to a QS (quoted string).
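For reference, the relevant bit of the logstash filter is roughly this (simplified from memory; the surrounding conditionals and other filters are left out):

filter {
  if [type] == "apache" {
    grok {
      # the legacy HTTPD_COMBINEDLOG pattern captures the last quoted field as %{QS:agent}
      match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    }
  }
}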

If I delete the daily index and let it get re-created, the errors stop and logs start flowing in again.
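The delete is nothing special, just the usual index delete against that day's index, e.g.:

curl -X DELETE "localhost:9200/logstash-2020.03.03"

(the date obviously changes each day).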

I checked the mapping for our index in Kibana and agent is mapped as a string.
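The equivalent check from the command line, in case that's easier to compare than the Kibana view:

curl -X GET "localhost:9200/logstash-*/_mapping/field/agent?pretty"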

I turned on rubydebug output in logstash to compare the events from a host running filebeat 5.x and a host running filebeat 7.x, and the output looks the same:

5.x:
"agent" => "\"-\"",

7.x:
"agent" => "\"curl/7.47.0\"",

Looks the same to me.
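For completeness, that rubydebug output comes from temporarily adding a stdout output to the logstash pipeline, something like:

output {
  stdout {
    # pretty-prints each event so individual fields like [agent] can be compared
    codec => rubydebug
  }
}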

I checked our logstash-* template and there is no mention of agent, so the mapping must be getting created dynamically from the input. I can't figure out where or why it's trying to map this field as an object. Any help would be much appreciated.

This continues to happen on a daily basis. I'm struggling to find a solution.

This morning the daily index flipped and it started erroring out. I took a look at the agent field in the new index and it shows this:

          "properties" : {
            "ephemeral_id" : {
              "type" : "text",
              "norms" : false,
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },

I deleted the index, which caused it to get recreated and got everything working again. The agent field on the newly created index now shows:

          "type" : "text",
          "norms" : false,
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },

I'm not sure where the ephemeral_id stuff is coming from. I think it's coming from filebeat, which sends its own agent field about itself, but I added a config in my filebeat.yml to rename that field to filebeat_agent.
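For what it's worth, the rename I added to filebeat.yml looks roughly like this (paraphrased, so treat it as a sketch rather than the exact config):

processors:
  - rename:
      fields:
        - from: "agent"
          to: "filebeat_agent"
      ignore_missing: true
      fail_on_error: false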

Here is my logstash template; you can see it's not coming from there:

root@ps-dev-elk:/var/log/elasticsearch# curl -X GET "localhost:9200/_template/logstash*?pretty"
{
  "logstash" : {
    "order" : 0,
    "version" : 60001,
    "index_patterns" : [
      "logstash-*"
    ],
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "refresh_interval" : "5s"
      }
    },
    "mappings" : {
      "dynamic_templates" : [
        {
          "message_field" : {
            "path_match" : "message",
            "mapping" : {
              "norms" : false,
              "type" : "text"
            },
            "match_mapping_type" : "string"
          }
        },
        {
          "string_fields" : {
            "mapping" : {
              "norms" : false,
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "ignore_above" : 256,
                  "type" : "keyword"
                }
              }
            },
            "match_mapping_type" : "string",
            "match" : "*"
          }
        }
      ],
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "geoip" : {
          "dynamic" : true,
          "properties" : {
            "ip" : {
              "type" : "ip"
            },
            "latitude" : {
              "type" : "half_float"
            },
            "location" : {
              "type" : "geo_point"
            },
            "longitude" : {
              "type" : "half_float"
            }
          }
        },
        "@version" : {
          "type" : "keyword"
        }
      }
    },
    "aliases" : { }
  }
}

I'm going to try dropping all of the agent.* fields that are getting sent by filebeat. I'm not sure this is going to help but I guess it's worth a shot. Won't know until tomorrow when the daily index flips.
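The processor I'm adding to filebeat.yml is roughly this (again a sketch, not copied verbatim):

processors:
  - drop_fields:
      # drop the agent block that filebeat 7.x adds about itself
      fields: ["agent"]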

Dropping the fields from filebeat did not help; as soon as the daily index flips I start getting these errors again:

Mar  6 16:52:14 ps-dev-elk logstash[100874]: [2020-03-06T16:52:14,788][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2020.03.06", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0x451a9318>], :response=>{"index"=>{"_index"=>"logstash-2020.03.06", "_type"=>"_doc", "_id"=>"U7LDsHABQ1tpPHB5Aco7", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [agent] tried to parse field [agent] as object, but found a concrete value"}}}}

There was a suggestion from #logstash that this must be happening dynamically from the log input. Here's the output from filebeat writing to the console, showing it's no longer sending any agent.* fields. I still can't figure out why or where this is happening, but it doesn't seem to be coming from the log input.

{
  "@timestamp": "2020-03-06T18:28:07.540Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.6.0"
  },
  "input": {
    "type": "log"
  },
  "type": "apache",
  "ecs": {
    "version": "1.4.0"
  },
  "host": {
    "name": "ps-dev-web01.domain.com"
  },
  "log": {
    "offset": 4292777,
    "file": {
      "path": "/var/log/apache2/benefits.log"
    }
  },
  "message": "::1 - - [06/Mar/2020:18:28:07 +0000] \"GET /server-status?auto HTTP/1.1\" 200 1365 \"-\" \"-\" \"-\""
}
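That console output came from temporarily swapping the logstash output for the console output in filebeat.yml, something like:

output.console:
  pretty: true

and then running filebeat in the foreground with filebeat -e -c /etc/filebeat/filebeat.yml (the package-default config path; adjust as needed).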
