Regarding illegal_argument_exception and empty value

Hi all,

I am getting many warning messages like this:

 [2019-06-21T10:34:01,663][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"cv_threat_event_all_201812", :_type=>"_doc", :routing=>nil}, #<LogStash::Event:0x6592b850>], :response=>{"index"=>{"_index"=>"cv_threat_event_all_201812", "_type"=>"_doc", "_id"=>"7tQae2sBhpHZbQ7dLP79", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", 
"reason"=>"failed to parse field [begin_time] of type [date] in document with id '7tQae2sBhpHZbQ7dLP79'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"failed to parse date field [HTTP] with format [EEE dd MMM yyyy HH:mm:ss z]", "caused_by"=>{"type"=>"date_time_parse_exception", "reason"=>"Text 'HTTP' could not be parsed at index 0"}}}}}}

This indicates that Logstash is trying to parse "HTTP" as type "date", which definitely won't work.

So I looked into the input CSV file and realized that some values are missing.

A correct file looks like the following:

timestamp,device_sn,action_id,app_id,app_name,begin_time
Sat 01 Dec 2018 01:48:12 GMT,ABC111,1,15,HTTP,Sat 01 Dec 2018 01:46:20 GMT

but some of the rows are missing the app_name:

Sat 01 Dec 2018 04:18:09 GMT,261030KSA2670697,6,0,,Sat 01 Dec 2018 04:17:20 GMT 
Sat 01 Dec 2018 18:58:03 GMT,261030KSA2670697,6,0,,Sat 01 Dec 2018 18:50:22 GMT

So I assume Logstash is taking in "HTTP" as begin_time.

Is there any way to resolve this?

If you are parsing that with a csv filter, that should just result in

  "app_name" => nil,

What do your filters look like?


Hi Badger,

Here is my csv filter,

csv {
  autodetect_column_names => "true"
  autogenerate_column_names => "true"
  skip_header => "true"
  separator => ","
}

Also, in the index template I have date detection turned on:

 "date_detection": true,
 "dynamic_date_formats": ["EEE dd MMM yyyy HH:mm:ss z"]

Can you check using the mapping API whether you have just the 6 fields you expect?


Hi Badger,

Actually, that file contains about 50 fields.

And the begin_time is mapped to date.

Here is part of the response from GET /cv_threat_event_all_201812/_mapping:

   "app_name" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "attack" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "begin_time" : {
      "type" : "date",
      "format" : "EEE dd MMM yyyy HH:mm:ss z"
    },

Here is the part about attack in my Logstash config file:

  dissect {
    mapping => {
      "path" => "/%{}/%{attack}/%{month}-%{}/%{}/%{type}_%{}"
    }
    add_tag => ["%{type}"]
  }

We have many kinds of files in the same folder, so we use the tag to route the data to the correct index.
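Roughly, the routing part of the output looks like this (a sketch with illustrative tag and index names, not the exact config):

 output {
   # the tag added by the dissect filter decides the target index;
   # "threat_event" and the index name here are placeholders
   if "threat_event" in [tags] {
     elasticsearch {
       hosts => ["localhost:9200"]
       index => "cv_threat_event_all_%{month}"
     }
   }
 }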

Is the tag causing the trouble?

I don't think so. I suggest you set up a dead letter queue (DLQ) to capture the messages that are failing to get mapped. Then run them through a different pipeline and look at the raw message. Maybe they have an extra comma or something.
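A minimal sketch of that setup, assuming a default Logstash install and a pipeline id of "main": enable the DLQ in logstash.yml, then read the failed events back with the dead_letter_queue input in a separate pipeline.

 # in logstash.yml
 dead_letter_queue.enable: true

 # dlq-review.conf - a separate pipeline that reads back the failed events
 input {
   dead_letter_queue {
     path => "/var/lib/logstash/dead_letter_queue"   # ${path.data}/dead_letter_queue by default; adjust to your install
     pipeline_id => "main"
     commit_offsets => false
   }
 }
 output {
   stdout { codec => rubydebug { metadata => true } }
 }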

