Failed to parse error for certain records

Good morning all. I am ingesting some logs, and while some records make it over, I am seeing errors in the Logstash logs for others, and those records are not making it over to Kibana outside of the standard fields (doc_id, index, time, etc.). I have provided my conf.d below, along with the errors and the logs that are being ingested. Any information to rectify this would be greatly appreciated.

logstash conf.d

input {
  file {
    path => "/var/log/logstash/casb_storage/proofpoint/10KLines.txt"
    sincedb_path => "/var/log/logstash/sincedb_path/proofpoint_pos_file"
    start_position => "end"
    mode => "tail"
  }
}

filter {
  grok {
    match => { "message" => "\[(?<timestamp>[^\]]+)\] %{GREEDYDATA:proofpoint}" }
  }

  kv {
    source => "proofpoint"
    field_split => " "
    include_brackets => true
    recursive => "true"
    value_split => "="
    whitespace => "strict"
  }

  date {
    locale => "en"
    match => [ "timestamp", "YYYY-MM-dd HH:mm:ss.SSSSSS ZZ" ]
  }

  mutate {
    remove_field => [ "proofpoint", "message", "syslog_hostname", "path", "@index", "@version", "host", "port", "tags" ]
  }
}

output {
  elasticsearch {
    hosts => [ "http://es.server:9200" ]
    user => "elastic"
    password => "elastic_test_p@ssw0rd"
    index => "proofpoint-%{+YYYY.MM}"
  }
}

Errors in logs (this is repetitive for a lot of different fields)

[    WARN ] 2019-06-07 10:31:18.198 [[main]>worker8] elasticsearch - Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"proofpoint-2019.06", :_type=>"_doc", :routing=>nil}, #<LogStash::Event:0x67d691f3>], :response=>{"index"=>{"_index"=>"proofpoint-2019.06", "_type"=>"_doc", "_id"=>"GD6QMmsBh_NbzFsD0AjZ", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [dict] of type [text] in document with id 'GD6QMmsBh_NbzFsD0AjZ'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:102"}}}}}
    [WARN ] 2019-06-07 10:31:18.199 [[main]>worker5] elasticsearch - Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"proofpoint-2019.06", :_type=>"_doc", :routing=>nil}, #<LogStash::Event:0x5824460d>], :response=>{"index"=>{"_index"=>"proofpoint-2019.06", "_type"=>"_doc", "_id"=>"fD6QMmsBh_NbzFsD0AXK", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [version] tried to parse field [version] as object, but found a concrete value"}}}}

Logs for ingestion

[2019-06-06 17:06:18.402388 -0400] rprt s=2sunues3yv m=1 x=2sunues3yv-1 mod=av cmd=run rule=clean vendor=fsecure version="vendor=fsecure engine=2.50.10434:,, definitions=2019-06-06_14:,, signatures=0" duration=0.000
[2019-06-06 17:08:03.667159 -0400] info mod=regulation type=mail cmd=refresh id=0 action=load dict=CJ27_ExtClass_B file=/opt/server/pps-

One of those messages results in

    "version" => {
             "engine" => "2.50.10434:,,",
        "definitions" => "2019-06-06_14:,,",
         "signatures" => "0",
             "vendor" => "fsecure"
    }

and the other in

      "dict" => "CJ27_ExtClass_B",

I believe those errors are trying to tell you that you have already ingested documents in which [dict] is a hash (like [version] is above), and documents in which [version] is a string (like [dict] is above). A field cannot be both: once its type is set, Elasticsearch cannot map documents that have the wrong type.
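To make the conflict concrete, here is a minimal Ruby sketch of the idea. It is a simplified model and not how Elasticsearch is implemented: it assumes the first document containing a field fixes that field as either an object or a concrete value, and that later documents with the other kind are rejected. The helper names (kind_of_value, find_conflicts) and the two sample documents marked "hypothetical" are made up for illustration; only the [version] hash and the [dict] string come from the output shown above.

```ruby
# Simplified model of the mapping conflict (illustration only, not the
# actual Elasticsearch implementation).
def kind_of_value(value)
  value.is_a?(Hash) ? :object : :concrete
end

# Returns one [doc_index, field, mapped_kind, seen_kind] entry per conflict.
def find_conflicts(docs)
  mapping = {}    # field name => :object or :concrete, set by first sighting
  conflicts = []
  docs.each_with_index do |doc, i|
    # Fields whose kind disagrees with what an earlier document established.
    bad = doc.select { |f, v| mapping.key?(f) && mapping[f] != kind_of_value(v) }
    if bad.empty?
      doc.each { |f, v| mapping[f] ||= kind_of_value(v) }
    else
      bad.each { |f, v| conflicts << [i, f, mapping[f], kind_of_value(v)] }
    end
  end
  conflicts
end

docs = [
  { "version" => { "vendor" => "fsecure", "signatures" => "0" } },
  { "version" => "1.2.3" },                        # hypothetical string value
  { "dict" => "CJ27_ExtClass_B" },
  { "dict" => { "name" => "CJ27_ExtClass_B" } }    # hypothetical object value
]

find_conflicts(docs).each do |i, field, mapped, seen|
  puts "doc #{i}: [#{field}] is mapped as #{mapped} but this document has #{seen}"
end
```

Running something like this over a sample of parsed events (for example, events written out with a file or stdout output) may help show which fields flip between string and hash.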

Thanks @Badger. Would it be safe to assume this could be tied to the template that I created? Should I explicitly map these fields, or clear the index and re-create it with a modified dynamic template?

No, that is not a safe assumption. If the template is wrong it could cause this, for example if it specifies that version is a string when it is always an object. However, it is also possible that your data structure is inconsistent, so that version (and dict) are sometimes strings and sometimes objects.

If that is the case then you would need a ruby filter that would check to see if the value of dict/version is an instance of a string, and if so, replace it with a hash that contains the value. Something like

    ruby {
        code => '
            # Wrap string values in a hash so the field is always an object.
            [ "dict", "version" ].each { |x|
                xvalue = event.get(x)
                if xvalue and xvalue.kind_of?(String)
                    event.set(x, { x => xvalue })
                end
            }
        '
    }
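The same wrapping logic can be tried outside Logstash before putting it in the pipeline. The sketch below is a stand-in, not the filter itself: it assumes a plain Ruby Hash in place of the Logstash event (so it uses [] access rather than event.get / event.set), and the method name wrap_string_fields is made up for illustration.

```ruby
# Stand-alone version of the wrapping logic, using a plain Hash as a
# stand-in for the Logstash event (an assumption made for testing).
def wrap_string_fields(event, fields = ["dict", "version"])
  fields.each do |name|
    value = event[name]
    # Only strings are wrapped; hashes and missing fields are left alone.
    event[name] = { name => value } if value.is_a?(String)
  end
  event
end

p wrap_string_fields({ "dict" => "CJ27_ExtClass_B" })
p wrap_string_fields({ "version" => { "vendor" => "fsecure" } })
```

With this approach a string such as dict=CJ27_ExtClass_B ends up under [dict][dict], so the field is always an object by the time it reaches Elasticsearch. An index that already holds the field as text would presumably still need to be re-created or reindexed for the new shape to be accepted.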

Thanks, Badger. I will work on this and see if I can determine where the issue is based on your feedback. I can tell you for now that the data structure of these logs is inconsistent, which makes my life a bit difficult, so I do appreciate the help as always.
