Object mapping errors

Hi All,

We've run into an issue with Logstash (or possibly Filebeat) where some log entries are not being sent to Elasticsearch.

We're currently running version 7.15.2 of all the Elastic components, although I haven't been able to upgrade the Filebeat index templates (a "Failed to create alias" error).

The real problem is that at some point the index template appears to have changed; since then we get an error when Logstash tries to send a log entry to Elasticsearch.

[2021-11-24T19:43:47,695][WARN ][logstash.outputs.elasticsearch][main][62e7452dfd34a31b86d1a598a072d020b91ba5ad395f41f076ee391ccfc3761b] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-2021.11.24", :routing=>nil}, {"message"=>"Process Id:left-slow-function", "tags"=>["beats_input_codec_plain_applied"], "source_host"=>"fqdn.hostname.com", "input"=>{"type"=>"log"}, "thread_name"=>"https-jsse-nio-443-exec-9", "ecs"=>{"version"=>"1.11.0"}, "exception"=>{}, "@timestamp"=>2021-11-24T08:40:31.059Z, "mdc"=>{"request.method"=>"POST", "central.instanceid"=>"CENTRAL-CENTRAL", "satellite.instanceid"=>"24", "process.id"=>"left-slow-function", "request.millis"=>"1637743231059", "request.url"=>"/api/flows", "process.type"=>"publicApi"}, "class"=>"com.livnapi.v2.satellite.connectorapi.security.InitMDCRequestFilter", "agent"=>{"version"=>"7.15.2", "type"=>"filebeat", "hostname"=>"fqdn.hostname.com", "name"=>"fqdn.hostname.com", "ephemeral_id"=>"60ae65d2-7350-4a73-bc91-5b0eadc5952e", "id"=>"47dc051e-fa39-4949-8366-278cf6764b5e"}, "level"=>"INFO", "line_number"=>73, "method"=>"filter", "logger_name"=>"com.livnapi.v2.satellite.connectorapi.security.InitMDCRequestFilter", "@version"=>1, "log"=>{"file"=>{"path"=>"/log/file/path/satellite.log"}, "offset"=>202796384}, "file"=>"InitMDCRequestFilter.java", "container"=>{"id"=>"api"}, "cloud"=>{"machine"=>{"type"=>"t3.medium"}, "service"=>{"name"=>"EC2"}, "account"=>{"id"=>"AWS.ACC.ID"}, "provider"=>"aws", "region"=>"ca-central-1", "availability_zone"=>"ca-central-1a", "instance"=>{"id"=>"i-EC2InstanceID"}, "image"=>{"id"=>"ami-0db254a15041c2aaa"}}, "host"=>{"hostname"=>"fqdn.hostname.com", "name"=>"fqdn.hostname.com", "os"=>{"version"=>"2", "type"=>"linux", "family"=>"redhat", "name"=>"Amazon Linux", "kernel"=>"4.14.214-160.339.amzn2.x86_64", "codename"=>"Karoo", "platform"=>"amzn"}, "mac"=>["02:48:c9:48:97:30", "02:42:39:22:f0:bf"], "ip"=>["10.102.1.181", "fe80::48:c9ff:fe48:9730", 
"172.17.0.1"], "architecture"=>"x86_64", "id"=>"ec231a73e612e83bbb4b461ff58e86aa", "containerized"=>false}, "fields"=>{"environment"=>"production"}}], :response=>{"index"=>{"_index"=>"filebeat-2021.11.24", "_type"=>"_doc", "_id"=>"YX8dUX0BktDuV5upfiZZ", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [file] tried to parse field [file] as object, but found a concrete value"}}}}

We also get the same error, but with a url field instead.
I also find it interesting that the beats_input_codec_plain_applied tag is being applied when Filebeat should be parsing it as a JSON log entry.

Filebeat input config is:

filebeat.inputs:
- type: log
  enabled: true
  json.keys_under_root: true
  json.overwrite_keys: true
  json.message_key: message
  paths:
    - /log/file/path/**/*
  exclude_files:
    - '._localhost_access_log\.log'
    - '\.gz$'
    - '\.log-[[:digit:]]{8}$'
  fields:
    environment: production

The Logstash pipeline is:

input {
  beats {
    port => "5044"
    ssl => false
  }
}

filter {
  if "GET /actuator/health HTTP/1." in [message] {
    drop {}
  }
}

output {
  if [@metadata][pipeline] {
    elasticsearch {
      hosts => "localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
      pipeline => "%{[@metadata][pipeline]}"
    }
  } else {
    elasticsearch {
      hosts => "localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    }
  }
}

Yes, SSL is disabled; it's on the to-do list.
I'm not sure why the output is configured like that; I inherited the cluster from the previous sysadmin.

Additional information: the Log4j template being used (log4j2-logstash-layout with the LogstashJsonEventLayoutV1.json layout) is an old one and is also on the to-do list to upgrade. If that turns out to be the issue, a workaround (filtering out the affected fields, I assume) will be needed until then.

Thank you in advance for the help.

"mapper_parsing_exception", "reason"=>"object mapping for [file] tried to parse field [file] as object, but found a concrete value"}

That is saying that the event has a [file] field that is a string (or number, or date, or whatever)

"file"=>"InitMDCRequestFilter.java"

but Elasticsearch expects it to be an object with fields nested inside it. A field on a document cannot be a string on some documents and an object on others.

You will have to modify the event to match what Elasticsearch expects. That might be as simple as

mutate { rename => { "[file]" => "[file][name]" } }

or if on some events it is a string and on others an object you might need a conditional test

if ! [file][name] { ... }

or even use ruby to test if event.get("file").is_a? String
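In plain-Ruby terms (a sketch with made-up events standing in for the Logstash event API, where event.get would return the same values), the test distinguishes the two cases like this:

```ruby
# Hypothetical events: in one [file] is a concrete string, in the other it is
# already an object (a Hash in Ruby terms). `.is_a? String` tells them apart.
string_event = { "file" => "InitMDCRequestFilter.java" }
object_event = { "file" => { "path" => "/log/file/path/satellite.log" } }

puts string_event["file"].is_a?(String)  # true  -> needs the rename
puts object_event["file"].is_a?(String)  # false -> already an object
```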

Thank you so much Badger. I completely forgot about the possibility of renaming the field.

I like the look of using ruby to match against the field when it contains a string; I think that's the most future-proof solution. However, I'm very stuck on how to achieve this. Could I ask for your help once again, please?

I'm not sure if I should use ruby to match and mutate to change, or if the whole lot can be done in ruby. I haven't found any documentation on how to use ruby to manipulate a field name, or how to use ruby to make the match.

Something like:

filter {
  ruby {
    code => '
      if event.get("file").is_a? String
        event.set("json.file")
      end
    '
  }
  ruby {
    code => '
      if event.get("url").is_a? String
        event.set("request.url")
      end
    '
  }
}

Thanks again.

You might (or might not) find this post useful, which talks a little more about these mapping issues.

For renaming fields I would lean toward using mutate rather than ruby. For deciding whether you need to do the mutate, ruby would work:

ruby {
    code => '
        if event.get("file").is_a? String
            event.set("[@metadata][fileIsString]", true)
        end
    '
}
if [@metadata][fileIsString] {
    mutate { ... }
}

Note the single quotes around the code block of the ruby filter, so that you have the option to use string interpolation when needed.
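Put together, the flag-then-rename flow behaves like this plain-Ruby sketch (the hash and the helper name are illustrative stand-ins for the Logstash event and the ruby-filter-plus-mutate combination, not actual plugin code):

```ruby
# Rename a string-valued [file] to [file][name]; leave the field alone when
# it is already an object, mirroring the conditional mutate above.
def fix_file_field(event)
  if event["file"].is_a?(String)
    event["file"] = { "name" => event["file"] }  # the conditional mutate
  end
  event
end

puts fix_file_field({ "file" => "InitMDCRequestFilter.java" }).inspect
puts fix_file_field({ "file" => { "path" => "/a.log" } }).inspect
```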

Thank you. I wasn't sure if that would have worked, but it was another idea floating around in my head.

That post was interesting and filled out my picture of what was happening, following on from your first answer in my thread.

I have two fields with this issue, one is [file] and the other is [url]. The [url] one I want to rename to [request][url]. The log entries with [url] won't ever have the [file] field as a string, nor will they have a [request] object. My question is can I:

if [url] { ... }

or will that also match entries that do have [request][url]?

Thanks

That will work. if [url] ... tests for a top-level field; it will not match a nested field.
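In hash terms (a plain-Ruby sketch of the event, with illustrative values), that distinction looks like this:

```ruby
# `if [url]` checks the top level only: event["url"] is nil when the value
# lives under event["request"]["url"] instead.
flat   = { "url" => "/api/flows" }
nested = { "request" => { "url" => "/api/flows" } }

puts !flat["url"].nil?    # true  -> `if [url]` matches
puts !nested["url"].nil?  # false -> `if [url]` does not match
```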

That makes sense :)

Thank you again, Badger.