Logstash Mapping - Duplicate values in nested properties

Hello, can someone please help with a mapping that contains more than one nested-type property? We are on version 8.0 and use Logstash to sync data from our database to the Elasticsearch index.

Problem: The nested-type properties in the document contain duplicated data whenever the Logstash config file maps more than one nested-type property. Let me explain with a sample below.

Index Mapping

PUT test
{
  "settings": {
    "index.mapping.coerce": false
  },
  "mappings": {
    "dynamic": "strict",
    "properties" : {
        "agreementId" : {
          "type" : "text",
          "copy_to" : [
            "primaryFields"
          ]
        },
        "customers" : {
          "properties" : {
            "customerId" : {
              "type" : "keyword",
              "index" : false,
              "doc_values" : false
            },
            "customerAddresses" : {
              "type" : "nested",
              "properties" : {
                "custAddress" : {
                  "type" : "text"
                },
                "custAddressType" : {
                  "type" : "keyword",
                  "doc_values" : false
                }
              }
            },
            "phones" : {
              "properties" : {
                "phonenumber" : {
                  "type" : "text",
                  "copy_to" : [
                    "primaryFields"
                  ]
                },
                "phonetype" : {
                  "type" : "keyword",
                  "doc_values" : false
                }
              }
            }
          }
        }
    }
  }
}

In my database, the agreement number is the primary key, and an agreement can have more than one customer profile (let's use one in this scenario). Each customer can have multiple phones and multiple addresses. The output of my query looks something like this:

**agreement**   **customer**   **Address**     **Addresstype**   **Contact**   **Contacttype**
123456789       10             123 Main St.    Mailing           1111111111    Home
123456789       10             123 Main St.    Mailing           2222222222    Cell
123456789       10             456 South       Billing           1111111111    Home
123456789       10             456 South       Billing           2222222222    Cell
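
To make the shape of this result set explicit: the four rows look like the cross product of the customer's two addresses and two phones, so each phone shows up once per address and each address once per phone. Here is a minimal plain-Ruby sketch of that, using only the sample values from the table (the variable names are illustrative, not part of my pipeline):

```ruby
# Sample values taken from the table above.
addresses = [['123 Main St.', 'Mailing'], ['456 South', 'Billing']]
phones    = [['1111111111', 'Home'], ['2222222222', 'Cell']]

# Joining both child tables to the same customer yields every
# address/phone combination: 2 addresses x 2 phones = 4 rows.
rows = addresses.product(phones).map do |(addr, addr_type), (number, phone_type)|
  {
    'agreement'   => 123456789,
    'customer'    => 10,
    'Address'     => addr,
    'Addresstype' => addr_type,
    'Contact'     => number,
    'Contacttype' => phone_type
  }
end

puts rows.size                                        # number of joined rows
puts rows.count { |r| r['Contact'] == '1111111111' }  # rows carrying the Home phone
```

Every row therefore carries one phone and one address, which matters for any code that appends both on each row.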

When the document is created in the index, this is what it looks like:

{
    "took": 474,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": null,
        "hits": [
            {
                "_index": "test",
                "_id": "123456789",
                "_score": null,
                "_source": {
                    "agreementId": 123456789,
                    "customers": [
                        {
                            "phones": [
                                {
                                    "phonetype": "Cell",
                                    "phonenumber": "2222222222"
                                },
                                {
                                    "phonetype": "Cell",
                                    "phonenumber": "2222222222"
                                },
                                {
                                    "phonetype": "Home",
                                    "phonenumber": "1111111111"
                                },
                                {
                                    "phonetype": "Home",
                                    "phonenumber": "1111111111"
                                }
                            ],
                            "customerAddresses": [
                                {
                                    "custAddressType": "Mailing",
                                    "custAddress": "123 Main St."
                                },
                                {
                                    "custAddressType": "Billing",
                                    "custAddress": "456 South"
                                },
                                {
                                    "custAddressType": "Mailing",
                                    "custAddress": "123 Main St."
                                },
                                {
                                    "custAddressType": "Billing",
                                    "custAddress": "456 South"
                                }
                            ]
                        }
                    ]
                },
                "sort": [
                    1713679200000
                ]
            }
        ]
    }
}

As you can see, every phone and every customer address appears twice. Here is how the mapping is defined in the Logstash config file.

Config Mapping

aggregate {
        task_id => "%{agreement}"
        code => "
                        map['agreementId'] = event.get('agreement')                       
                        
                         map['customers'] ||= []
                        if (event.get('customer') != nil)

                                customer_found = false
                                map['customers'].each { |cus|
                                        if cus['customerId'] == event.get('customer')
                                                customer_found = true
                                        end
                                }

                                if !customer_found
                                        map['customers'] << {
                                        'customerId' => event.get('customer')                          
                                        }
                                end
                                
                                map['customers'].each { |cus|
                                        if cus['customerId'] == event.get('customer') && event.get('Contact') != nil
                                                cus['phones'] ||=[]
                                                cus['phones'] << {
                                                'phonenumber' => event.get('Contact'),
                                                'phonetype' => event.get('Contacttype'),
                                                }
                                        end
                                }
                                
                                map['customers'].each { |cus|
                                        if cus['customerId'] == event.get('customer') && event.get('Address') != nil
                                                cus['customerAddresses'] ||=[]
                                                cus['customerAddresses'] << {
                                                'custAddress' => event.get('Address'),
                                                'custAddressType' => event.get('Addresstype'),
                                                }
                                        end
                                }
                        end
                                       
                        event.cancel()
            "
             push_previous_map_as_event => true
             timeout => 5
             timeout_tags => ['aggregated']
    }
    if "aggregated" not in [tags] {
            drop {}
        }
}

I also tried declaring the "phones" property as a nested type, but the duplication persists.
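
One thing I noticed while looking at the code: it appends a phone and an address for every incoming row without checking whether that entry is already in the list. Below is a minimal plain-Ruby sketch of the same aggregation with such a check added. It runs outside Logstash, so this is only an assumption about what would help: `rows` stands in for the events, hash lookups stand in for `event.get`, and `append_unique` and `aggregate_rows` are hypothetical helper names.

```ruby
# Append entry only if an identical hash is not already in the list.
def append_unique(list, entry)
  list << entry unless list.include?(entry)
end

# Stand-in for the aggregate filter: folds all rows of one agreement into one map.
def aggregate_rows(rows)
  map = {}
  rows.each do |row|
    map['agreementId'] = row['agreement']
    map['customers'] ||= []
    next if row['customer'].nil?

    # Find the customer entry, creating it on first sight.
    cus = map['customers'].find { |c| c['customerId'] == row['customer'] }
    if cus.nil?
      cus = { 'customerId' => row['customer'] }
      map['customers'] << cus
    end

    unless row['Contact'].nil?
      cus['phones'] ||= []
      append_unique(cus['phones'],
                    { 'phonenumber' => row['Contact'], 'phonetype' => row['Contacttype'] })
    end

    unless row['Address'].nil?
      cus['customerAddresses'] ||= []
      append_unique(cus['customerAddresses'],
                    { 'custAddress' => row['Address'], 'custAddressType' => row['Addresstype'] })
    end
  end
  map
end

# The four sample rows from the query output.
rows = [
  ['123 Main St.', 'Mailing', '1111111111', 'Home'],
  ['123 Main St.', 'Mailing', '2222222222', 'Cell'],
  ['456 South',    'Billing', '1111111111', 'Home'],
  ['456 South',    'Billing', '2222222222', 'Cell']
].map do |addr, addr_type, number, phone_type|
  { 'agreement' => 123456789, 'customer' => 10, 'Address' => addr,
    'Addresstype' => addr_type, 'Contact' => number, 'Contacttype' => phone_type }
end

doc = aggregate_rows(rows)
puts doc['customers'].first['phones'].length             # distinct phones kept
puts doc['customers'].first['customerAddresses'].length  # distinct addresses kept
```

With the sample rows this keeps two phones and two addresses. Inside the aggregate filter, the equivalent change would presumably be an `include?` check before each `<<`, but I have not verified that in the actual pipeline.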

@stephenb
