Hello, can someone please help with a mapping that contains more than one nested-type property? We are using version 8.0, and we use Logstash to sync data from our database to the ES index.
Problem: I am seeing duplicate data in the document for the nested-type properties whenever the Logstash config file maps more than one nested-type property. Let me explain with the example below.
Index Mapping
PUT test
{
  "settings": {
    "index.mapping.coerce": false
  },
  "mappings": {
    "dynamic": "strict",
    "properties" : {
      "agreementId" : {
        "type" : "text",
        "copy_to" : [
          "primaryFields"
        ]
      },
      "customers" : {
        "properties" : {
          "customerId" : {
            "type" : "keyword",
            "index" : false,
            "doc_values" : false
          },
          "customerAddresses" : {
            "type" : "nested",
            "properties" : {
              "custAddress" : {
                "type" : "text"
              },
              "custAddressType" : {
                "type" : "keyword",
                "doc_values" : false
              }
            }
          },
          "phones" : {
            "properties" : {
              "phonenumber" : {
                "type" : "text",
                "copy_to" : [
                  "primaryFields"
                ]
              },
              "phonetype" : {
                "type" : "keyword",
                "doc_values" : false
              }
            }
          }
        }
      }
    }
  }
}
In my database, the agreement number is the primary key, and an agreement can have more than one customer profile (let's use one in this scenario). Each customer can have multiple phones and multiple addresses. The query output looks something like this:

| agreement | customer | Address | Addresstype | Contact | Contacttype |
|---|---|---|---|---|---|
| 123456789 | 10 | 123 Main St. | Mailing | 1111111111 | Home |
| 123456789 | 10 | 123 Main St. | Mailing | 2222222222 | Cell |
| 123456789 | 10 | 456 South | Billing | 1111111111 | Home |
| 123456789 | 10 | 456 South | Billing | 2222222222 | Cell |
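
To make the shape of that result set explicit: with 2 addresses and 2 phones for one customer, joining both one-to-many relations in a single query gives 2 × 2 = 4 rows. The following is only a minimal Ruby sketch of that pairing; the variable names are hypothetical and the actual SQL is not shown in this post.

```ruby
# Hypothetical stand-ins for the address and phone data of customer 10.
addresses = [['123 Main St.', 'Mailing'], ['456 South', 'Billing']]
phones    = [['1111111111', 'Home'], ['2222222222', 'Cell']]

# Joining both one-to-many relations pairs every address with every phone,
# so each address and each phone appears in more than one row.
rows = addresses.product(phones).map do |(address, address_type), (contact, contact_type)|
  {
    'agreement'   => 123456789,
    'customer'    => 10,
    'Address'     => address,
    'Addresstype' => address_type,
    'Contact'     => contact,
    'Contacttype' => contact_type
  }
end

# Print one tab-separated line per joined row.
rows.each { |row| puts row.values.join("\t") }
```

Each phone shows up once per address and each address once per phone, which matters for any code that appends to an array on every row.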
When the document is created in the index, this is how it looks:
{
  "took": 474,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "test",
        "_id": "123456789",
        "_score": null,
        "_source": {
          "agreementId": 123456789,
          "customers": [
            {
              "phones": [
                {
                  "phonetype": "Cell",
                  "phonenumber": "2222222222"
                },
                {
                  "phonetype": "Cell",
                  "phonenumber": "2222222222"
                },
                {
                  "phonetype": "Home",
                  "phonenumber": "1111111111"
                },
                {
                  "phonetype": "Home",
                  "phonenumber": "1111111111"
                }
              ],
              "customerAddresses": [
                {
                  "custAddressType": "Mailing",
                  "custAddress": "123 Main St."
                },
                {
                  "custAddressType": "Billing",
                  "custAddress": "456 South"
                },
                {
                  "custAddressType": "Mailing",
                  "custAddress": "123 Main St."
                },
                {
                  "custAddressType": "Billing",
                  "custAddress": "456 South"
                }
              ]
            }
          ]
        },
        "sort": [
          1713679200000
        ]
      }
    ]
  }
}
As you can see, both the phones and the customer addresses are repeated. Here is how the mapping is defined in the config file.
Config Mapping
filter {
  aggregate {
    task_id => "%{agreement}"
    code => "
      map['agreementId'] = event.get('agreement')
      map['customers'] ||= []
      if (event.get('customer') != nil)
        # Add the customer entry once per agreement
        customer_found = false
        map['customers'].each { |cus|
          if cus['customerId'] == event.get('customer')
            customer_found = true
          end
        }
        if !customer_found
          map['customers'] << {
            'customerId' => event.get('customer')
          }
        end
        # Append the phone carried by this row
        map['customers'].each { |cus|
          if cus['customerId'] == event.get('customer') && event.get('Contact') != nil
            cus['phones'] ||= []
            cus['phones'] << {
              'phonenumber' => event.get('Contact'),
              'phonetype' => event.get('Contacttype'),
            }
          end
        }
        # Append the address carried by this row
        map['customers'].each { |cus|
          if cus['customerId'] == event.get('customer') && event.get('Address') != nil
            cus['customerAddresses'] ||= []
            cus['customerAddresses'] << {
              'custAddress' => event.get('Address'),
              'custAddressType' => event.get('Addresstype'),
            }
          end
        }
      end
      event.cancel()
    "
    push_previous_map_as_event => true
    timeout => 5
    timeout_tags => ['aggregated']
  }
  if "aggregated" not in [tags] {
    drop {}
  }
}
I even tried defining the "phones" property as a nested type, but the duplication remains.
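
One possible reading of the duplication is that it happens in the aggregation, not in the mapping: the config appends a phone and an address for every incoming row, and each phone and address appears in more than one row. This is a minimal plain-Ruby sketch under that assumption, not a tested Logstash config. Here `rows` stands in for the incoming events and `row['…']` for `event.get('…')`, and the `include?` guard is an illustrative choice.

```ruby
# The four joined rows from the example (stand-ins for Logstash events).
pairs = [
  ['123 Main St.', 'Mailing', '1111111111', 'Home'],
  ['123 Main St.', 'Mailing', '2222222222', 'Cell'],
  ['456 South',    'Billing', '1111111111', 'Home'],
  ['456 South',    'Billing', '2222222222', 'Cell']
]
rows = pairs.map do |address, address_type, contact, contact_type|
  { 'agreement' => 123456789, 'customer' => 10,
    'Address' => address, 'Addresstype' => address_type,
    'Contact' => contact, 'Contacttype' => contact_type }
end

map = {} # plays the role of the aggregate filter's shared map

rows.each do |row|
  map['agreementId'] = row['agreement']
  map['customers'] ||= []
  next if row['customer'].nil?

  # Find the customer entry, or create it the first time it is seen.
  cus = map['customers'].find { |c| c['customerId'] == row['customer'] }
  if cus.nil?
    cus = { 'customerId' => row['customer'] }
    map['customers'] << cus
  end

  unless row['Contact'].nil?
    phone = { 'phonenumber' => row['Contact'], 'phonetype' => row['Contacttype'] }
    cus['phones'] ||= []
    # Append only when an identical phone hash is not already present.
    cus['phones'] << phone unless cus['phones'].include?(phone)
  end

  unless row['Address'].nil?
    address = { 'custAddress' => row['Address'], 'custAddressType' => row['Addresstype'] }
    cus['customerAddresses'] ||= []
    # Same guard for addresses.
    cus['customerAddresses'] << address unless cus['customerAddresses'].include?(address)
  end
end

puts map['customers'][0]['phones'].length
puts map['customers'][0]['customerAddresses'].length
```

Ruby compares hashes by content, so `include?` detects a phone or address that an earlier row already added. If this reading is right, the same guard could be tried inside the `code` block, using single-quoted strings and `event.get` in place of `row[...]`.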