Problem Sorting Unique Values in CSV

Hi,

I've been trying to solve this issue and have hit a dead end. I'd appreciate some advice.

I have a CSV file. The real file has up to about a thousand lines. Below is a sample:

"timestamp","name","type","value","date"
"2019-11-27 05:42:45.000","bill.acme.com","ns","ns-1468.awsdns-55.org","201911"
"2019-11-27 05:42:45.000","bill.acme.com","ns","ns-1777.awsdns-30.co.uk","201911"
"2019-11-27 05:42:45.000","bill.acme.com","ns","ns-258.awsdns-32.com","201911"
"2019-11-27 05:42:45.000","bill.acme.com","ns","ns-638.awsdns-15.net","201911"
"2019-11-27 05:51:28.000","bills-record.global.acme.com","a","10.144.152.62","201911"
"2019-11-27 05:51:28.000","bills-record.global.acme.com","a","10.144.126.177","201911"
"2019-11-27 05:51:28.000","bills-record.global.acme.com","a","10.144.135.205","201911"

What I'm trying to achieve is this: if name, type and date have the same values, the output should be a single event for that name with the multiple IP addresses/strings as its value. The list of IP addresses/strings should not contain duplicates. There can be anywhere from 1 to n values.

I've tried using if statements, but to no avail. I know that add_field will automatically turn a field into an array if it isn't one already, but I can't get it to collect multiple IP addresses.

The sample JSON output I'm hoping for is below:

{
          "type" => "ns",
          "date" => "201911",
          "name" => "bill.acme.com",
      "@version" => "1",
     "timestamp" => "2019-11-27 05:42:45.000",
         "value" => [
        [0] "ns-1468.awsdns-55.org",
        [1] "ns-1777.awsdns-30.co.uk",
        [2] "ns-258.awsdns-32.com",
        [3] "ns-638.awsdns-15.net"
    ]
}
{
          "type" => "a",
          "date" => "201911",
          "name" => "bills-record.global.acme.com",
      "@version" => "1",
     "timestamp" => "2019-11-27 05:51:28.000",
         "value" => [
        [0] "10.144.152.62",
        [1] "10.144.126.177",
        [2] "10.144.135.205"
    ]
}

Thanks! :)

You can do it using an aggregate filter. Make sure you have --pipeline.workers set to 1, so that all events pass through the same worker in order, and that you have Java execution turned off (either on the command line or with 'pipeline.java_execution: false' in logstash.yml).
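
Something along these lines might work as a starting point. It is an untested sketch, not a drop-in config: it assumes the column names from your header row, that rows sharing the same name/type/date are adjacent in the file (as in your sample), and that a 5 second timeout is long enough to flush the last group. The input and output sections are left out, and the `|` separator in task_id is just my choice.

```
filter {
    csv {
        # Column names taken from the header row of the sample file
        columns => ["timestamp", "name", "type", "value", "date"]
    }

    # The header row is parsed like any other line, so drop it
    if [timestamp] == "timestamp" { drop {} }

    aggregate {
        # One map per unique name/type/date combination
        task_id => "%{name}|%{type}|%{date}"
        code => '
            # Keep the scalar fields from the first row of the group
            map["timestamp"] ||= event.get("timestamp")
            map["name"]      ||= event.get("name")
            map["type"]      ||= event.get("type")
            map["date"]      ||= event.get("date")

            # Collect values, skipping any already seen for this group
            map["value"] ||= []
            v = event.get("value")
            map["value"] << v unless map["value"].include?(v)

            # Discard the per-row event; only the aggregated map is emitted
            event.cancel
        '
        # Emit the previous map as a new event when the task_id changes
        push_previous_map_as_event => true
        # Flush the final map once no further rows arrive (assumed value)
        timeout => 5
    }
}
```

With push_previous_map_as_event, a group is only emitted when a row with a different task_id shows up, which is why the rows need to be grouped together in the file and why the timeout is needed to push out the last group.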