Hello,
I'm having trouble figuring out how to populate a nested field with values from multiple columns of a CSV file. For example:
I have a CSV data source that has the following columns:
company_id
company_name
company_alias1
company_alias_description1
company_alias2
company_alias_description2
company_alias3
company_alias_description3
company_alias4
company_alias_description4
company_alias5
company_alias_description5
latitude
longitude
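A hypothetical pipe-delimited row (the values are made up; only two of the five alias pairs are populated) might look like this:

```
C0042|Acme Inc.|Acme Co|Former name|Acme Corp.|Legal name|||||||40.7128|-74.0060
```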
I created an index mapping that looks like this:
curl -XPUT 'localhost:9200/companies_v1/_mapping/company?pretty' -d '
{
  "company": {
    "properties": {
      "company_id": { "type": "string" },
      "company_name": { "type": "string" },
      "company_alias1": { "type": "string" },
      "company_alias2": { "type": "string" },
      "company_alias3": { "type": "string" },
      "company_alias4": { "type": "string" },
      "company_alias5": { "type": "string" },
      "geo_coordinates": { "type": "geo_point" },
      "company_aliases": {
        "type": "nested",
        "properties": {
          "name": { "type": "string" },
          "description": { "type": "string" }
        }
      }
    }
  }
}
'
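To illustrate what I'm after, the document I'm hoping to end up with in Elasticsearch would look something like this (made-up values):

```json
{
  "company_id": "C0042",
  "company_name": "Acme Inc.",
  "company_alias1": "Acme Co",
  "company_alias2": "Acme Corp.",
  "geo_coordinates": { "lat": 40.7128, "lon": -74.0060 },
  "company_aliases": [
    { "name": "Acme Co",    "description": "Former name" },
    { "name": "Acme Corp.", "description": "Legal name" }
  ]
}
```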
The "company_alias*" fields are optional, but when they do have values they will always appear in sequence. In other words, if company_alias3 has a value, I can assume that company_alias2 and company_alias1 also have values. I would like each enumerated "company_alias#" field to be paired with its matching "company_alias_description#" field, which is why I'd like to use the nested data type. (I hope I explained that correctly.)
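Here is the pairing logic I have in mind, sketched in plain Ruby (the field names follow my CSV columns; build_aliases is just a name I made up for illustration):

```ruby
# Sketch of the pairing logic: walk the numbered alias columns in order and
# stop at the first empty one, since the columns are always populated in sequence.
def build_aliases(row)
  aliases = []
  (1..5).each do |i|
    name = row["company_alias#{i}"]
    break if name.nil? || name.empty?
    aliases << {
      "name"        => name,
      "description" => row["company_alias_description#{i}"]
    }
  end
  aliases
end

row = {
  "company_alias1"             => "Acme Co",
  "company_alias_description1" => "Former name",
  "company_alias2"             => "Acme Corp.",
  "company_alias_description2" => "Legal name"
}
p build_aliases(row)
# prints [{"name"=>"Acme Co", "description"=>"Former name"}, {"name"=>"Acme Corp.", "description"=>"Legal name"}]
```

What I don't know is how to express this inside my Logstash filter block.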
Here is the code I have thus far:
input {
  file {
    path => "/data/companies.txt"
    type => "company"
    start_position => "beginning"
    ignore_older => 0
  }
}
filter {
  csv {
    columns => [
      "company_id",
      "company_name",
      "company_alias1",
      "company_alias_description1",
      "company_alias2",
      "company_alias_description2",
      "company_alias3",
      "company_alias_description3",
      "company_alias4",
      "company_alias_description4",
      "company_alias5",
      "company_alias_description5",
      "latitude",
      "longitude"
    ]
    separator => "|"
    skip_empty_columns => true
  }
  if [company_alias1] {
    # <<<<<<<<<<<<<<<<<<<<<<<< HELP!
  }
  if [longitude] and [latitude] {
    mutate {
      convert => {
        "latitude" => "float"
        "longitude" => "float"
      }
      add_field => {
        "[geo_coordinates][lat]" => "%{latitude}"
        "[geo_coordinates][lon]" => "%{longitude}"
      }
    }
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => "localhost:9200"
    index => "companies_v1"
    workers => 1
    document_id => "%{company_id}"
  }
  #stdout {
  #  codec => rubydebug
  #}
}
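For the section marked HELP!, I was wondering whether a ruby filter along these lines might work. This is untested guesswork on my part, and event.get/event.set assume the newer (5.x) Logstash event API:

```
if [company_alias1] {
  ruby {
    code => '
      aliases = []
      (1..5).each do |i|
        name = event.get("company_alias#{i}")
        break if name.nil?
        aliases << {
          "name"        => name,
          "description" => event.get("company_alias_description#{i}")
        }
      end
      event.set("company_aliases", aliases)
    '
  }
}
```

Is something like this the right approach, or is there a better way to build the nested array?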
Any insights would be greatly appreciated. I'm new to Logstash and Elasticsearch.
Thank you!