I'm using Logstash to bulk load data into Elasticsearch using a CSV config.
I'm getting lots of warnings like:
retrying failed action with response code: 429 {:level=>:warn}
And lots of errors like:
too many attempts at sending event. dropping: 2015-06-16T01:06:22.000Z ip-172-31-30-197 93522,itsmxjunbu01,0,Backup,chile.eesus.jnj.com,"Jun 15, 2015 {:level=>:error} 16, 2015 1:06:22 AM",03:05:28,32,0,3
My command to load the data looks like this:
cat data.csv | /opt/logstash/bin/logstash -f ./csvload.conf
The csvload.conf looks like this:
# read input from stdin (e.g. pipe)
input {
  stdin {}
}

filter {
  # filter the input by csv (i.e. comma-separated-value)
  csv {
    columns => [
      "JobID",
      "ServerName",
      "StatusCode",
      "JobType",
      "ClientName",
      "StartTime",
      "EndTime",
      "Duration",
      "Volume-KB",
      "NumberofFiles",
      "Throughput-KB-sec"
    ]
  }

  date {
    # parse the "End Time" to create a real date
    # Examples of times in this log file
    #   "May 29, 2015 10:00:01 PM"
    #   "May 9, 2015 4:47:23 AM"
    #   "May 23, 2015 12:23:49 PM"
    match => [ "EndTime",
               "MMM dd, YYYY hh:mm:ss aa",
               "MMM d, YYYY hh:mm:ss aa" ]
  }

  mutate { replace => { "type" => "nbu_job" } }

  mutate { gsub    => [ "NumberofFiles", ",", "" ] }
  mutate { convert => [ "NumberofFiles", "integer" ] }
  mutate { gsub    => [ "Volume-KB", ",", "" ] }
  mutate { convert => [ "Volume-KB", "integer" ] }
  mutate { gsub    => [ "Throughput-KB-sec", ",", "" ] }
  mutate { convert => [ "Throughput-KB-sec", "integer" ] }

  # Example of Duration = "04:28:13" which is hours, minutes, and seconds
  # Split up and create the respective integer fields
  grok {
    match => [ "Duration", "%{NUMBER:hours:int}:%{NUMBER:minutes:int}:%{NUMBER:seconds:int}" ]
  }

  # Call ruby to perform the basic arithmetic of computing total seconds
  ruby {
    code => "event['Elapsed'] = event['hours']*3600 + event['minutes']*60 + event['seconds']"
  }

  translate {
    field           => "ServerName"
    destination     => "Country"
    dictionary_path => "./cmdb/ServerByCountry.yaml"
    fallback        => "Unknown"
  }
}
# send the output to stdout, using the rubydebug codec
# rubydebug uses the Ruby Awesome Print library
output {
  # stdout { codec => rubydebug }
  elasticsearch { host => localhost }
}
But honestly, I don't care about CSV import performance at all; I just wish I could figure out a way to slow down the load. It seems like Logstash and/or Elasticsearch is choking.
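The only other idea I've had is to throttle the pipe itself with pv (assuming it's installed), so Logstash simply can't read the file faster than a fixed rate:

# cap the read rate at roughly 100 kB/s (rate picked arbitrarily)
pv -L 100k data.csv | /opt/logstash/bin/logstash -f ./csvload.conf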
-Chad