Hopefully this is the right place. I am fairly new to the ELK stack, so I am not sure whether what I am trying to do in Logstash is feasible.
I am consuming a CSV file and want to convert it into JSON in the following format:
"properties":
{
    "date": "2015-09-26T16:33:53",
    "origin": "UK",
    "status": "SUCCESS"
},
"geometry":
{
    "type": "Point",
    "coordinates":
    [
        latitude,
        longitude
    ]
}
I was wondering whether Logstash can convert a file from a flat format such as CSV into GeoJSON. My Logstash config is below. I was hoping I could pass some sort of template to tell it to convert the data into the format above and then write it out to Elasticsearch. Any advice or recommendations would be appreciated.
Alternatively, I was thinking of creating a small app in Java to do the conversion, but I was hoping Logstash had some capability for this.
Thanks
Regards
Sam
input {
  lumberjack {
    # The port to listen on
    port => 5000
    # The paths to your ssl cert and key
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
    # Set this to whatever you want.
    type => "my_data"
  }
}
filter {
  csv {
    columns => [Timestamp,status,latitude,longitude,countryCode,countryName,regionName]
    separator => ","
  }
  date {
    match => ["Timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => http
    index => my_data
  }
}
Have a look at the mutate filter. You can mostly get away with rename operations.
mutate {
  rename => {
    "Timestamp" => "[properties][date]"
    "countryCode" => "[properties][origin]"
    "status" => "[properties][status]"
  }
}
Oh, and another thing:
columns => [Timestamp,status,latitude,longitude,countryCode,countryName,regionName]
This needs to be:
columns => ["Timestamp", "status", ...]
(It would've been convenient if the csv filter could've created the nested fields you want in the end but I'm not sure that's possible. You can try using the [field][subfield] notation and see what happens.)
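Putting the two suggestions above together, the filter section might look like the sketch below. It reuses the column names from the original config and has not been run against the poster's data. Note that the date filter writes the parsed time to @timestamp by default, so the renamed [properties][date] field would still hold the original string from the CSV.

```
filter {
  csv {
    # Column names written as quoted strings, as recommended above
    columns => ["Timestamp", "status", "latitude", "longitude", "countryCode", "countryName", "regionName"]
    separator => ","
  }
  date {
    match => ["Timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
  mutate {
    # Move flat CSV columns under the nested "properties" object
    rename => {
      "Timestamp" => "[properties][date]"
      "countryCode" => "[properties][origin]"
      "status" => "[properties][status]"
    }
  }
}
```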
Thanks for responding. I could be doing something stupid, but I get the error below when I add the field type => Feature (code snippet further down):
{:timestamp=>"2015-10-08T11:03:58.680000+0100", :message=>"Got error to send bulk of actions: [500] {"error":"IllegalArgumentException[Malformed action/metadata line [1], expected a simple value for field [_type] but found [START_ARRAY]]","status":500}", :level=>:error}
1.) You said above that the column names need to be surrounded by quotes. Is there a reason for this? It did work without them.
2.) My other mutate, where I am adding geometry, appears to be incorrect as it fails the configtest. It could be that I'm not understanding it properly.
mutate {
  add_field => [ "geometry" { "co-ordinates" [ "%{latitude} %{longitude}" ] } ]
  add_field => {
    "type" => "Feature"
  }
  rename => {
    "Timestamp" => "[properties][date]"
  }
}
Note: I've placed the mutate block after the date block from the original config, so it sits inside the filter section.
When you use add_field for changing the type you actually turn type into an array with multiple values, which is what Elasticsearch is complaining about.
You can save yourself a lot of trouble by not sending to ES at this point. Use a stdout { codec => rubydebug } output until you've verified that the messages look as expected.
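As a minimal sketch of that suggestion, the output section could temporarily be reduced to the following, so that each event is printed to the console instead of being indexed. The elasticsearch block from the original config would be put back once the events look right.

```
output {
  # Print every event in a readable form for inspection
  stdout { codec => rubydebug }
}
```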
- I'm surprised if that worked. I don't know why.
- Yeah, your add_field syntax for geometry is really weird.
Maybe this works (because, again, add_field for an existing field creates an array):
add_field => ["[geometry][coordinates]", "%{latitude}"]
add_field => ["[geometry][coordinates]", "%{longitude}"]
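In context, the two lines above would sit inside a mutate block, as in the sketch below. The second mutate is an addition that is not discussed in the thread: interpolated values such as %{latitude} are strings, so it uses convert to turn them into floats, assuming a Logstash version in which convert accepts the [field][subfield] notation. It is a separate block because add_field is applied only after a mutate's own operations have run. Also note that the GeoJSON specification lists longitude before latitude; the order here follows the JSON in the question.

```
filter {
  mutate {
    # Adding the same field twice yields an array: [latitude, longitude]
    add_field => ["[geometry][coordinates]", "%{latitude}"]
    add_field => ["[geometry][coordinates]", "%{longitude}"]
  }
  mutate {
    # Assumption, not from the thread: convert each array member to a float
    convert => { "[geometry][coordinates]" => "float" }
  }
}
```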
Thanks, that worked. I am now trying to add a field following the mutate guide, where it says newfield => "static value", as shown below, but when I try to add this type field the configtest fails:
add_field => { "type" => "Feature" }
The current config below works; adding the line above makes it fail:
mutate {
  add_field => [ "geometry" { "co-ordinates" [ "%{latitude} %{longitude}" ] } ]
  rename => {
    "Timestamp" => "[properties][date]"
    "countryCode" => "[properties][countryCode]"
    "countryName" => "[properties][countryName]"
    "regionName" => "[properties][regionName]"
    "status" => "[properties][status]"
  }
}
add_field => [ "geometry" { "co-ordinates" [ "%{latitude} %{longitude}" ] } ]
Wait, didn't you say the last time that this didn't work (and indeed, I don't understand how it ever could)?
No Magnus, that code didn't work; it was me thinking I could add fields using JSON syntax.
What did work was the following, where the field names were not in quotes:
columns => [Timestamp,status,latitude,longitude,countryCode,countryName,regionName]
I am currently trying to get this to work: add_field => { "type" => "Feature" }. Based on what you said above, I'm guessing I should send it as an array, like below (I haven't tested it yet as I'm currently away from my computer):
add_field => [ ["type"] , "Feature"]
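As explained earlier in the thread, add_field on a field that already exists produces an array, and here type is already set to "my_data" by the lumberjack input. One possible alternative, not tested in this thread, is the mutate filter's replace option, which overwrites the value of an existing field. Be aware that the elasticsearch output appears to take the document type from the event's type field (the error above mentions _type), so overwriting it would also change how the documents are typed in the index.

```
filter {
  mutate {
    # Overwrite the existing "type" value instead of appending to it
    replace => { "type" => "Feature" }
  }
}
```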
Thanks Magnus, with your help I managed to sort out my config.