Pivot the field using CSV data in Logstash

Hi All

I have CSV data like this:

    Object#,Type,Count
    A001,P0,5
    A002,P1,3
    A003,P0,4
    A005,P1,2
    A006,P2,2

I need the output in the format below. I tried with a ruby filter, but I am not sure which gem to use exactly.

Please let me know if someone has done something similar using a ruby filter.

The desired output (reconstructed from the screenshot in the original post: Type values become columns):

    Object#,P0,P1,P2
    A001,5,,
    A002,,3,
    A003,4,,
    A005,,2,
    A006,,,2

I have the input as a .csv file and need the output as a .csv file.

Thanks

So you want the value from the second column to determine which output column the value from the third column appears in?

Yes, the values from the second column will become the column headers, and the values of the third column will be segregated accordingly.

Any idea how we can do this?

Thanks

    csv { autodetect_column_names => true remove_field => [ "message" ] }
    if [Type] == "P0" {
        mutate { add_field => { "P0" => "%{Count}" } }
    } else {
        mutate { add_field => { "P0" => "" } }
    }
    if [Type] == "P1" {
        mutate { add_field => { "P1" => "%{Count}" } }
    } else {
        mutate { add_field => { "P1" => "" } }
    }
    if [Type] == "P2" {
        mutate { add_field => { "P2" => "%{Count}" } }
    } else {
        mutate { add_field => { "P2" => "" } }
    }
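
For context, a minimal end-to-end sketch around that filter (the paths are placeholders, and the csv output needs its fields list spelled out) could look like this:

    input {
        file {
            path => "/path/to/input.csv"          # placeholder path
            start_position => "beginning"
        }
    }
    filter {
        csv { autodetect_column_names => true remove_field => [ "message" ] }
        # ... the conditional mutate blocks for P0, P1, P2 from above ...
    }
    output {
        csv {
            path => "/path/to/output.csv"         # placeholder path
            fields => [ "Object#", "P0", "P1", "P2" ]
        }
    }

Note that autodetect_column_names relies on the header row being the first event the filter sees, so run this with a single pipeline worker.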

Hi, thanks for your reply.

I am looking for a dynamic solution. In the above I gave an example with 3 values (P0, P1, P2) to keep the question simple.

In my real dataset there are approximately 50 to 70 distinct values of Type, and they change on a monthly basis, so I am looking for a dynamic solution: with every pipeline run, the values from Type should become the columns.

I have tried this gem, but I am not sure how to really construct it with a ruby filter.

The solution you provided above is hardcoded for the values P0, P1, and P2.

Thanks

The csv output will output empty fields if the event is missing them, so all you really need is

    csv { autodetect_column_names => true remove_field => [ "message" ] }
    ruby { code => 'event.set(event.get("Type"), event.get("Count"))' }
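
To make the behaviour concrete (this is just the event semantics of that one-liner, not extra configuration): for the row A002,P1,3 the ruby filter reads "P1" out of [Type] and creates a field with that name holding the Count, so the event ends up as (rubydebug, trimmed):

    {
        "Object#" => "A002",
           "Type" => "P1",
          "Count" => "3",
             "P1" => "3"
    }

A csv output with fields => [ "Object#", "P0", "P1", "P2" ] would then write A002,,3, for that event, since the missing P0 and P2 come out empty.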

The fields option on the csv output is not optional, but you could use a technique similar to this to find the set of fields required.

Hi @Badger

I tried this, but it seems I am getting the correct output in stdout and not in the csv. Below is what I tried:

  1. I am not able to generate the csv fields dynamically (the mandatory fields => [ " ", " ", ... ] option).
  2. I am not able to give headers to the csv output.

Any insight would be really appreciated.
Filter

    ruby {
        code => '
            csvFields = []
            event.to_hash.each { |k, v|
                csvFields << k
            }
            event.set("[csvFields]", csvFields.to_s)
        '
    }

Output

    output {
        csv {
            path => "....................\output.csv"
            fields => ["%{csvFields}"]
        }
    }

As I noted in the post I linked to, you cannot use either %{[fieldname]} or ${envVar} in the fields option for a csv output. The idea is to run the complete data set through logstash to generate the value required for that option, then you can copy and paste it.

filter {
    csv { autodetect_column_names => true }
    ruby {
        init => '@csvFields = []'
        code => '
            t = event.get("Type")
            unless @csvFields.include? t
                @csvFields << t
            end
            event.set("csvFields", @csvFields.to_s)
        '
    }
}
output { stdout { codec => plain { format => "%{csvFields}
" } } }

will produce

["P0"]
["P0", "P1"]
["P0", "P1"]
["P0", "P1"]
["P0", "P1", "P2"]

Just take the last line and paste it into your config.

Hi @Badger, thanks for replying back. I tried again multiple times; it looks like I am missing something.
Please check out my complete file below and suggest what is wrong here; I am not a big expert in ruby.
Are you saying I need 2 outputs, one csv and the other stdout? I am not clear on that.

    input {
        file {
            path => "......./cumsum.csv"
            sincedb_path => "......../cumsum.txt"
            start_position => "beginning"
            type => "csv"
        }
    }

    filter {
        csv {
            autodetect_column_names => true
            remove_field => [ "message" ]
        }
        ruby {
            code => '
                csvFields = []
                event.to_hash.each { |k, v|
                    csvFields << k
                }
                event.set("[csvFields]", csvFields.to_s)
            '
        }
        ruby {
            code => 'event.set(event.get("Type"), event.get("Count"))'
        }
    }

    output {
        file {
            path => ".........\output.csv"
            codec => plain { format => "%{csvFields}" }
        }
    }

I am getting this response in the output.csv file

    ["A001", "type", "@timestamp", "5", "P0", "@version", "path", "host"]["A001", "type", "@timestamp", "5", "P0", "@version", "path", "host"]["A001", "type", "@timestamp", "5", "P0", "@version", "path", "host"]["A001", "type", "@timestamp", "5", "P0", "@version", "path", "host"]

You have input data that contains an unknown set of values in the [Type] field. The idea is that we run that data through logstash in order to find what that set of values is. For that you should use the stdout output.

The last line of output will be a string like

["P0", "P1", "P2"]

which you can copy and paste into the fields option of a csv output.
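
For example, once the discovery run has printed its last line, the real pipeline keeps the csv filter and the one-line ruby filter from earlier, and the pasted array (plus any fixed columns such as Object#) goes into the output; the path here is a placeholder:

    output {
        csv {
            path => "/path/to/pivot.csv"
            fields => [ "Object#", "P0", "P1", "P2" ]
        }
    }

Each event then writes its Count into the column named after its Type, and the other columns stay empty.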

Okay, I was looking for full automation through the pipeline; here there is manual intervention to copy the fields into the csv output.

Also, I was looking for headers in the csv output.
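
One hedged idea for the headers (untested, so treat it as an assumption to verify): the csv output has a csv_options hash that is passed straight to Ruby's CSV library, and Ruby's CSV understands write_headers and headers, so something along these lines might be worth experimenting with:

    output {
        csv {
            path => "/path/to/pivot.csv"              # placeholder path
            fields => [ "Object#", "P0", "P1", "P2" ]
            # csv_options goes straight to Ruby's CSV library; check
            # that the header line comes out once, not once per event
            csv_options => {
                "write_headers" => true
                "headers" => ["Object#", "P0", "P1", "P2"]
            }
        }
    }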

Anyway, thanks for your help; I will look for some other solution.
