Hello!
I am trying to aggregate some data from a database with Logstash.
My data in the DB looks like this:
+----+------------+--------------+---------------+----------------+
| #  | product_id | product_name | property_name | property_value |
+----+------------+--------------+---------------+----------------+
| 1  | 100        | pc           | colour        | black          |
| 2  | 100        | pc           | colour        | silver         |
| 3  | 100        | pc           | ram           | 16Gb           |
| 4  | 100        | pc           | hdd           | 200Gb          |
| 5  | 101        | printer      | colour        | black          |
| 6  | 101        | printer      | features      | wifi           |
| 7  | 101        | printer      | features      | scanner        |
| 8  | 101        | printer      | type          | mate           |
| 9  | 102        | laptop       | features      | wifi 5Ghz      |
| 10 | 102        | laptop       | colour        | white          |
| 11 | 102        | laptop       | hdd           | 512Gb          |
+----+------------+--------------+---------------+----------------+
I want to aggregate the data by product_id and property_name in the following way:
[
    {
        "id": 100,
        "name": "pc",
        "properties": {
            "colour": [
                "black",
                "silver"
            ],
            "hdd": [
                "200Gb"
            ],
            "ram": [
                "16Gb"
            ]
        }
    },
    {
        "id": 101,
        "name": "printer",
        "properties": {
            "features": [
                "wifi",
                "scanner"
            ],
            "colour": [
                "black"
            ],
            "type": [
                "mate"
            ]
        }
    },
    {
        "id": 102,
        "name": "laptop",
        "properties": {
            "features": [
                "wifi 5Ghz"
            ],
            "colour": [
                "white"
            ],
            "hdd": [
                "512Gb"
            ]
        }
    }
]
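To make the target shape concrete, here is the same grouping written as plain Ruby, outside Logstash. It is only a sketch of what I expect: the names ROWS and group_products are made up for illustration, and the rows are copied from the table above.

```ruby
# Rows copied from the table: product_id, product_name, property_name, property_value.
ROWS = [
  [100, 'pc',      'colour',   'black'],
  [100, 'pc',      'colour',   'silver'],
  [100, 'pc',      'ram',      '16Gb'],
  [100, 'pc',      'hdd',      '200Gb'],
  [101, 'printer', 'colour',   'black'],
  [101, 'printer', 'features', 'wifi'],
  [101, 'printer', 'features', 'scanner'],
  [101, 'printer', 'type',     'mate'],
  [102, 'laptop',  'features', 'wifi 5Ghz'],
  [102, 'laptop',  'colour',   'white'],
  [102, 'laptop',  'hdd',      '512Gb']
].freeze

# Hypothetical helper: one hash per product_id, with the values of each
# property_name collected into an array under 'properties'.
def group_products(rows)
  products = {}
  rows.each do |product_id, product_name, property_name, property_value|
    product = (products[product_id] ||= {
      'id' => product_id,
      'name' => product_name,
      'properties' => {}
    })
    (product['properties'][property_name] ||= []) << property_value
  end
  products.values
end

require 'json'
puts JSON.pretty_generate(group_products(ROWS))
```

The point is that "properties" is a hash keyed by property name, and each value is an array.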
For this purpose I am trying to use the aggregate filter plugin. I have read example #4 in the docs and this topic.
Here is the filter part of my logstash.conf:
filter {
  aggregate {
    task_id => "%{product_id}"
    code => "
      map['product_id'] = event.get('product_id')
      map['product_name'] = event.get('product_name')
      map['properties'] ||= {}
      map[event.get('property_name')] ||= []                                         
      map[event.get('property_name')] << event.get('property_value') 
      event.cancel()
    "
    push_previous_map_as_event => true
    timeout => 3
  }
}
The output of filtering is:
{"product_id":100, "product_name":"pc", "properties":[], "colour":["black","silver"], "ram":["16Gb"], "hdd":["200Gb"]}
{"product_id":101, "product_name":"printer", "properties":[], "colour":["black"], "type":["mate"], "features":["wifi","scanner"]}
{"product_id":102, "product_name":"laptop", "properties":[], "colour":["white"],"hdd":["512Gb"], "features":["wifi 5Ghz"]}
But I need the properties (like "colour", "ram", "hdd") to be inside the "properties" field.
To achieve this I tried:
map['properties'] ||= []
map['properties'] << {
   map[event.get('property_name')] ||= []                                         
   map[event.get('property_name')] << event.get('property_value') 
}
But that doesn't work.
I'm not familiar with this syntax, so does anyone have an idea how to put the properties (like "colour", "ram", "hdd") inside "properties"? Am I missing something?
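As far as I understand, Ruby reads the braces after << as a hash literal, so the two statements inside them are not valid there. One variant I am considering indexes into map['properties'] directly. Below is a minimal sketch in plain Ruby that I have not verified inside Logstash: FakeEvent and update_map are hypothetical stand-ins (only event.get is modelled), and I use the keys "id" and "name" to match the desired output.

```ruby
# Hypothetical stand-in for a Logstash event; only get(field) is modelled.
class FakeEvent
  def initialize(fields)
    @fields = fields
  end

  def get(name)
    @fields[name]
  end
end

# The statements that would go into the aggregate `code` option, with the
# value lists nested under map['properties'] instead of the top level of map.
def update_map(map, event)
  map['id'] = event.get('product_id')
  map['name'] = event.get('product_name')
  map['properties'] ||= {}
  map['properties'][event.get('property_name')] ||= []
  map['properties'][event.get('property_name')] << event.get('property_value')
  map
end

# The four rows of product 100 from the table, as fake events.
EVENTS = [
  { 'product_id' => 100, 'product_name' => 'pc', 'property_name' => 'colour', 'property_value' => 'black' },
  { 'product_id' => 100, 'product_name' => 'pc', 'property_name' => 'colour', 'property_value' => 'silver' },
  { 'product_id' => 100, 'product_name' => 'pc', 'property_name' => 'ram',    'property_value' => '16Gb' },
  { 'product_id' => 100, 'product_name' => 'pc', 'property_name' => 'hdd',    'property_value' => '200Gb' }
].map { |fields| FakeEvent.new(fields) }

# Feed the events through one shared map, the way a single task_id would.
MAP = EVENTS.each_with_object({}) { |event, map| update_map(map, event) }
p MAP
```

If this is the right direction, the body of update_map (without the final `map` line) is what I would paste into the `code` string of the filter.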
Thank you!